#Fastcompression Fastvideo SDK Jetson
Jetson TX2 and AGX Xavier performance comparison
Performance comparison for Jetson TX2 and AGX Xavier
Imaging applications benefit from the latest NVIDIA mobile GPUs: Jetson TX2 and AGX Xavier. However, general benchmarks can't tell us how these two Jetson modules compare for image processing. This is a very practical question for many imaging applications, including aerial imaging, UAV, robotics, self-driving cars, etc. To provide real numbers, we've done comparative studies with Fastvideo SDK, which has lots of image processing modules for camera applications and is compatible with the full line of Jetson hardware.

How we've done Jetson TX2 vs Xavier performance comparison
We've measured timings for the most frequently used image processing algorithms, such as demosaic, resize, denoise, JPEG encoder and decoder, JPEG 2000 codec, etc. This is just a small part of the Fastvideo SDK modules, but it is enough to understand the performance speedup with Jetson AGX Xavier.
We've used the same images and the same parameters for the comparison. The Xavier boost is very important, because in many camera applications it lets us switch from offline to realtime mode of operation. The same holds for multiple camera systems on Jetson Xavier.
We can conclude that the performance speedup of AGX Xavier over TX2 is in the range of 1.7 to 3 times for imaging applications. This is an impressive boost for practitioners. Quite often the results of raw image processing go further as the input for AI applications, which have also been significantly boosted by the new Volta hardware cores on Jetson AGX Xavier.
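As a minimal sketch of how such a speedup figure is derived, the ratio below divides the TX2 time by the Xavier time for the same module, image and parameters. The function name and the timings are hypothetical placeholders for illustration, not measured results.

```python
def speedup(t_tx2_ms: float, t_xavier_ms: float) -> float:
    """Return how many times faster Xavier is than TX2 for one module."""
    if t_tx2_ms <= 0 or t_xavier_ms <= 0:
        raise ValueError("timings must be positive")
    return t_tx2_ms / t_xavier_ms


# Hypothetical per-module timings in milliseconds: (TX2, Xavier).
# These are placeholders, not benchmark data.
timings_ms = {
    "demosaic": (6.0, 2.0),
    "resize": (3.4, 2.0),
}

for module, (tx2, xavier) in timings_ms.items():
    print(f"{module}: {speedup(tx2, xavier):.2f}x")
```

The comparison is only meaningful when both timings come from the same input image and the same module parameters.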

Original article see here: https://www.fastcompression.com/blog/xavier-vs-tx2.htm
Fastvideo SDK vs NVIDIA NPP Library
Author: Fyodor Serzhenko
Why is Fastvideo SDK better than NPP for camera applications?
What is Fastvideo SDK?
Fastvideo SDK is a set of software components which correspond to high quality image processing pipeline for camera applications. It covers all image processing stages starting from raw image acquisition from the camera to JPEG compression with storage to RAM or SSD. All image processing is done completely on GPU, which leads to real-time performance or even faster for the full pipeline. We can also offer a high-speed imaging SDK for non-camera applications on NVIDIA GPUs: offline raw processing, high performance web, digital cinema, video walls, FFmpeg codecs and filters, 3D, AR/VR, AI, etc.
Who are Fastvideo SDK customers?
Fastvideo SDK is compatible with Windows/Linux/ARM and is mostly intended for camera manufacturers and system integrators developing end-user solutions containing video cameras as a part of their products.
The other type of Fastvideo SDK customers are developers of new hardware or software solutions in various fields: digital cinema, machine vision and industrial, transcoding, broadcasting, medical, geospatial, 3D, AR/VR, AI, etc.
All the above customers need faster image processing with higher quality and better latency. In most cases CPU-based solutions are unable to meet such requirements, especially for multicamera systems.
Customer pain points
According to our experience and expertise, when developing end-user solutions, customers usually have to deal with the following obstacles.
Before starting to create a product, customers need to know the image processing performance, quality and latency for the final application.
Customers need reliable software which has already been tested and will not glitch when it is least expected.
Customers are looking for an answer on how to create a new solution with higher performance and better image quality.
Customers need external expertise in image processing, GPU software development and camera applications.
Customers have limited (time/human) resources to develop end-user solutions bound by contract conditions.
They need a ready-made prototype as a part of the solution to demonstrate a proof of concept to the end user.
They want immediate support and answers to their questions regarding the fast image processing software's performance, image quality and other technical details, which can be delivered only by industry experts with many years of experience.
Fastvideo SDK business benefits
Fastvideo SDK as a part of complex solutions allows customers to gain competitive advantages.
Customers are able to design solutions which earlier may have seemed to be impossible to develop within required timeframes and budgets.
The product helps to decrease the time to market of end-user solutions.
At the same time, it increases overall end-user satisfaction with reliable software and prompt support.
As a technology solution, Fastvideo SDK improves image quality and processing performance.
Fastvideo serves customers as a technology advisor in the field of fast image processing: the team of experts provides end-to-end service to customers. That means that all customer questions regarding Fastvideo SDK, as well as any other technical questions about fast image processing are answered in a timely manner.
Fastvideo SDK vs NVIDIA NPP comparison
NVIDIA NPP can be described as a general-purpose solution, because the company implemented a huge set of functions intended for applications in various industries, and the NPP solution mainly focuses on various image processing tasks. Moreover, NPP lacks consistency in feature delivery, as some specific image processing modules are not presented in the NPP library. This leads us to the conclusion that NPP is a good solution for basic camera applications only. It is just a set of functions which users can utilize to develop their own pipeline.
Fastvideo SDK, on the other hand, is designed to implement a full 16/32-bit image processing pipeline on GPU for camera applications (machine vision, scientific, digital cinema, etc). Our end-user applications are based on Fastvideo SDK, and we collect customer feedback to improve the SDK’s quality and performance. We are armed with profound knowledge of customer needs and offer an exceptionally reliable and heavily tested solution.
Fastvideo uses a specific approach in Fastvideo SDK which is based on components (not on functions as in NPP). It is easier to build a pipeline based on components, as the components' input and output are standardized. Every component executes a complete operation, and it can have a complex architecture, whereas NPP only uses several functions. It is important to emphasize here that developing an application using built-in Fastvideo SDK is much less complex than creating a solution based on NVIDIA NPP.
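The idea of components with standardized input and output can be illustrated with a short sketch. This is not the Fastvideo SDK API: the component names, the `Image` type and `run_pipeline` are all hypothetical, and images are plain nested lists so the example stays dependency-free.

```python
from typing import Callable, List

# A "component" here is any callable that takes an image and returns an image.
Image = List[List[float]]
Component = Callable[[Image], Image]


def subtract_black_level(level: float) -> Component:
    """Build a component that subtracts a constant black level, clamped at 0."""
    def run(img: Image) -> Image:
        return [[max(px - level, 0.0) for px in row] for row in img]
    return run


def apply_gain(gain: float) -> Component:
    """Build a component that multiplies every pixel by a gain factor."""
    def run(img: Image) -> Image:
        return [[px * gain for px in row] for row in img]
    return run


def run_pipeline(components: List[Component], img: Image) -> Image:
    """Feed the output of each component into the next one."""
    for component in components:
        img = component(img)
    return img


# Because every component shares one interface, stages can be added,
# removed or reordered without touching the others.
pipeline = [subtract_black_level(64.0), apply_gain(2.0)]
result = run_pipeline(pipeline, [[64.0, 100.0], [200.0, 32.0]])
```

With a function-based library, the caller has to manage buffer formats and intermediate allocations between calls; a shared component interface moves that bookkeeping into the pipeline itself.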
The Fastvideo JPEG codec and lots of other SDK features have been heavily tested by our customers for many years, with a total performance benchmark of more than a million images per second. This is a question of software reliability, and we consider it one of our most important advantages.
The major part of the Fastvideo SDK components (debayer and codecs) can offer both high performance and image quality, leaving behind the NPP alternatives. What’s more, this is also true for embedded solutions on Jetson where computing performance is quite limited. For example, NVIDIA NPP only has a bilinear debayer, so it can be regarded as a low-quality solution, best suited only for software prototype development.
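To show why a bilinear debayer is considered low quality, here is a minimal sketch of bilinear interpolation for the green channel only. It is a generic textbook illustration, not NPP or Fastvideo code; it assumes an RGGB pattern (green where row + column is odd) and an image of at least 2×2 pixels.

```python
from typing import List


def bilinear_green(bayer: List[List[float]]) -> List[List[float]]:
    """Reconstruct the green channel of an RGGB Bayer image (at least 2x2)."""
    h, w = len(bayer), len(bayer[0])
    green = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if (x + y) % 2 == 1:
                # Green is sampled directly at this site.
                green[y][x] = float(bayer[y][x])
            else:
                # Red or blue site: average the available green neighbours.
                neighbours = [
                    bayer[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w
                ]
                green[y][x] = sum(neighbours) / len(neighbours)
    return green
```

Plain averaging ignores edges in the image, which is what produces the colour fringing and softness that more advanced demosaicing algorithms are designed to avoid.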
Summing up this section, we need to specify the following technological advantages of the Fastvideo SDK over NPP in terms of image processing modules for camera applications:
High-performance codecs: JPEG, JPEG2000 (lossless and lossy)
High-performance 12-bit JPEG encoder
Raw Bayer Codec
Flat-Field Correction together with dark frame subtraction
Dynamic bad pixel suppression in Bayer images
Four high quality demosaicing algorithms
Wavelet-based denoiser on GPU for Bayer and RGB images
Filters and codecs on GPU for FFmpeg
Other modules like color space and format conversions
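Of the modules listed above, flat-field correction with dark frame subtraction is easy to illustrate. The sketch below uses the commonly cited formula corrected = (raw - dark) * mean(flat - dark) / (flat - dark); whether Fastvideo SDK uses exactly this form is an assumption, and the function name is hypothetical.

```python
from typing import List

Image = List[List[float]]


def flat_field_correct(raw: Image, dark: Image, flat: Image) -> Image:
    """Apply corrected = (raw - dark) * mean(flat - dark) / (flat - dark)."""
    # Per-pixel response of sensor and optics to uniform illumination.
    response = [[f - d for f, d in zip(f_row, d_row)]
                for f_row, d_row in zip(flat, dark)]
    values = [v for row in response for v in row]
    mean_response = sum(values) / len(values)

    corrected = []
    for raw_row, dark_row, resp_row in zip(raw, dark, response):
        out_row = []
        for r, d, g in zip(raw_row, dark_row, resp_row):
            # Guard against dead pixels with zero or negative response.
            out_row.append((r - d) * mean_response / g if g > 0 else 0.0)
        corrected.append(out_row)
    return corrected
```

A pixel that sees only half the light of its neighbours in the flat frame is scaled up by the same factor, so a uniformly lit scene comes out uniform.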
To summarize, Fastvideo SDK offers an image processing workflow which is standard for digital cinema applications, and could be very useful for other imaging applications as well.
Why should customers consider Fastvideo SDK instead of NVIDIA NPP?
Fastvideo SDK provides better image quality and processing performance for implementing key algorithms for camera applications. The real-time mode is an essential requirement for any camera application, especially for multi-camera systems.
Over the last few years, we've tested NPP intensely and encountered software bugs which weren't fixed. In the meantime, if customers come to us with any bug in Fastvideo SDK, we fix it within a couple of days, because Fastvideo possesses all the source code and the image processing modules are implemented by the Fastvideo development team. Support is our priority: that's why our customers can rely on our SDK.
We offer custom development to meet our customers' specific requirements. Our development team can build GPU-based image processing modules from scratch according to the customer's request, whereas NVIDIA provides nothing of the kind.
We are focused on high-performance camera applications and we have years of experience, and our solutions have been heavily tested in many projects. For example, our customer vk.com has been processing 400,000 JPG images per second for years without any issue, which means our software is extremely reliable.
Software downloads to evaluate the Fastvideo SDK
GPU Camera Sample application with source codes including SDKs for Windows/Linux/ARM - https://github.com/fastvideo/gpu-camera-sample
Fast CinemaDNG Processor software for Windows and Linux - https://www.fastcinemadng.com/download/download.html
Demo applications (JPEG and J2K codecs, Resize, MG demosaic, MXF player, etc.) from https://www.fastcompression.com/download/download.htm
Fast JPEG2000 Codec on GPU for FFmpeg
You can test your RAW/DNG/MLV images with Fast CinemaDNG Processor software. To create your own camera application, please download the source codes from GitHub to get a ready solution ASAP.
Useful links for projects with the Fastvideo SDK
1. Software from Fastvideo for GPU-based CinemaDNG processing is 30-40 times faster than Adobe Camera Raw:
http://ir-ltd.net/introducing-the-aeon-motion-scanning-system
2. Fastvideo SDK offers high-performance processing and real-time encoding of camera streams with very high data rates:
https://www.fastcompression.com/blog/gpixel-gmax3265-image-sensor-processing.htm
3. GPU-based solutions from Fastvideo for machine vision cameras:
https://www.fastcompression.com/blog/gpu-software-machine-vision-cameras.htm
4. How to work with scientific cameras with 16-bit frames at high rates in real-time:
https://www.fastcompression.com/blog/hamamatsu-orca-gpu-image-processing.htm
Original article see at: https://www.fastcompression.com/blog/fastvideo-sdk-vs-nvidia-npp.htm Subscribe to our mail list: https://mailchi.mp/fb5491a63dff/fastcompression
JPEG Optimizer Library on CPU and GPU
Fastvideo has implemented the fastest JPEG Codec and Image Processing SDK for NVIDIA GPUs. That software can work at maximum performance with the full range of NVIDIA GPUs, from mobile Jetson to professional Quadro and Tesla server GPUs. Now we've extended these solutions to offer various optimizations of the standard JPEG algorithm. This is vitally important for getting better image compression while retaining the same perceived image quality within the existing JPEG Standard.
Our expert knowledge of the JPEG Standard and GPU programming is proven by the performance benchmarks of our JPEG Codec. It is also the basis for our custom software design, which solves various time-critical tasks in connection with JPEG images and corresponding services.
Our customers have been using that GPU-based software for fast JPEG encoding and decoding and for JPEG resize in high load web applications, and they asked us to implement more optimizations which are indispensable for web solutions. These are the most demanding tasks:
JPEG recompression to decrease file size without losing perceived image quality
JPEG optimization to get better user experience while loading JPEG images via slow connection
JPEG processing on users' devices
JPEG resize on-demand:
to store just one source image (to cut storage costs)
to match resolution of user's device (to exclude JPEG Resize on user's device)
to minimize traffic
to ensure minimum server response time
to offer better user experience
Implementations of JPEG Baseline, Extended, Progressive and Lossless parts of the Standard
Other tasks related to JPEG images
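For the resize on-demand task in the list above, the first step on the server is choosing the output resolution for a given device. A minimal sketch, assuming we keep the aspect ratio and never upscale; the function name and its parameters are hypothetical and not part of any Fastvideo API.

```python
from typing import Tuple


def fit_to_device(src_w: int, src_h: int, dev_w: int, dev_h: int) -> Tuple[int, int]:
    """Return an output size that fits the device and keeps the aspect ratio."""
    if min(src_w, src_h, dev_w, dev_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # The tighter of the two axes decides the scale; 1.0 forbids upscaling.
    scale = min(dev_w / src_w, dev_h / src_h, 1.0)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


# A 4000x3000 source requested by a 1920x1080 screen.
print(fit_to_device(4000, 3000, 1920, 1080))
```

Only the single source image is stored; each requested size is computed and produced at request time, which is what makes resize speed critical for server response time.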
The idea of image optimization is very popular and it really makes sense. Since JPEG is so widespread on the web, we need to optimize JPEG images for the web as well. By decreasing image size, we can save space for image storage, minimize traffic, improve latency, etc. There are many methods of JPEG optimization and recompression which can bring a better compression ratio while preserving perceptual image quality. In our products we strive to combine all of them with the goal of better performance on multicore CPUs and on modern GPUs.
There is a great variety of image processing tasks which are connected with JPEG handling. They can be solved either on CPU or on GPU. We are ready to offer custom software design to meet any special requirements that our customers may have. Please fill in the form below and send us your task description.
JPEG Optimizer Library and other software from Fastvideo
JPEG Optimizer Library (SDK for GPU/CPU on Windows/Linux) to recompress and to resize JPEG images for corporate customers: high load web services, photo stock applications, neural network training, etc.
Standalone JPEG optimizer application - in progress
Projects under development
JPEG optimizer SDK on CPU and GPU
Mobile SDK on CPU for Android/iOS for image decoding and visualization on smartphones
JPEG recompression library that runs inside your web app and optimizes images before upload
JPEG optimizer API for web
Online service for JPEG optimization
Fastvideo publications on the subject
JPEG Optimization Algorithms Review
Web resize on-the-fly on GPU
JPEG resize on-demand: FPGA vs GPU. Which is the fastest?
Jpeg2Jpeg Acceleration with CUDA MPS on Linux
JPEG compress and decompress with CUDA MPS
Original article see at: https://www.fastcompression.com/products/jpeg-optimizer-library.htm
Benchmark comparison for Jetson Nano, TX2, Xavier NX and AGX
Author: Fyodor Serzhenko
NVIDIA has released a series of Jetson hardware modules for embedded applications. NVIDIA® Jetson is the world's leading embedded platform for image processing and DL/AI tasks. Its high-performance, low-power computing for deep learning and computer vision makes it the ideal platform for mobile compute-intensive projects.
We've developed an Image & Video Processing SDK for NVIDIA Jetson hardware. Here we present performance benchmarks for the available Jetson modules. As an image processing pipeline, we consider a basic camera application as a good example for benchmarking.
Hardware features for Jetson Nano, TX2, Xavier NX and AGX Xavier
Here we present a brief comparison of Jetson hardware features to show the progress and variety of mobile solutions from NVIDIA. These units are aimed at different markets and tasks.
Table 1. Hardware comparison for Jetson modules
In camera applications, we can usually hide Host-to-Device transfers by implementing GPU Zero Copy or by overlapping GPU copy/compute. Device-to-Host transfers can be hidden via copy/compute overlap.
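The effect of hiding transfers can be estimated with an idealized timing model. The sketch below assumes perfect overlap between copies in both directions and kernel execution, which real hardware only approximates; the function names and the timings are hypothetical.

```python
def frame_time_serial(h2d_ms: float, kernel_ms: float, d2h_ms: float) -> float:
    """Time per frame when upload, compute and download run one after another."""
    return h2d_ms + kernel_ms + d2h_ms


def frame_time_overlapped(h2d_ms: float, kernel_ms: float, d2h_ms: float) -> float:
    """Steady-state time per frame when all three stages run concurrently.

    In a full pipeline, the slowest stage sets the throughput.
    """
    return max(h2d_ms, kernel_ms, d2h_ms)


# Hypothetical timings in milliseconds, not measured values.
h2d, kernel, d2h = 2.0, 5.0, 1.0
print(frame_time_serial(h2d, kernel, d2h))      # copies add to the frame time
print(frame_time_overlapped(h2d, kernel, d2h))  # copies are hidden behind compute
```

With zero copy the host-to-device term drops to zero, because the GPU reads the camera buffer directly instead of waiting for an explicit upload.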
Hardware and software for benchmarking
CPU/GPU NVIDIA Jetson Nano, TX2, Xavier NX and AGX Xavier
OS L4T (Ubuntu 18.04)
CUDA Toolkit 10.2 for Jetson Nano, TX2, Xavier NX and AGX Xavier
Fastvideo SDK 0.16.4
NVIDIA Jetson Comparison: Nano vs TX2 vs Xavier NX vs AGX Xavier
For these NVIDIA Jetson modules, we've done performance benchmarking for the following standard image processing tasks which are specific for camera applications: white balance, demosaic (debayer), color correction, resize, JPEG encoding, etc. That's not the full set of Fastvideo SDK features, but it's just an example to see what kind of performance we could get from each Jetson. You can also choose a particular debayer algorithm and output compression (JPEG or JPEG2000) for your pipeline.
Table 2. GPU kernel times for 2K image processing (1920×1080, 16 bits per channel, milliseconds)
Total processing time is calculated for the values from the gray rows of the table. This is done to show the maximum performance benchmarks for a specified set of image processing modules which correspond to real-life camera applications.
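The total time and the resulting maximum frame rate follow from simple arithmetic on the per-stage kernel times. A minimal sketch with hypothetical stage names and timings (placeholders, not the values from Table 2):

```python
from typing import Dict


def total_time_ms(stage_times_ms: Dict[str, float]) -> float:
    """Sum the kernel times of the stages included in the pipeline."""
    return sum(stage_times_ms.values())


def max_fps(total_ms: float) -> float:
    """Upper bound on frames per second for a given total time per frame."""
    if total_ms <= 0:
        raise ValueError("total time must be positive")
    return 1000.0 / total_ms


# Hypothetical kernel times in milliseconds for one 2K frame.
stages = {
    "white_balance": 0.5,
    "demosaic": 2.0,
    "color_correction": 0.5,
    "jpeg_encode": 2.0,
}

total = total_time_ms(stages)
print(f"total: {total:.1f} ms, max: {max_fps(total):.0f} fps")
```

This is an upper bound, since it counts GPU kernel time only and leaves out transfers, disk I/O and camera acquisition.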
Each Jetson module was run with maximum performance
MAX-N mode for Jetson AGX Xavier
15W for Jetson Xavier NX and Jetson TX2
10W for Jetson Nano
Here we've compared just the basic set of image processing modules from Fastvideo SDK to let Jetson developers evaluate the expected performance before building their imaging applications. Image processing from RAW to RGB or from RAW to JPEG is a standard task, and now developers can get detailed info about expected performance for the chosen pipeline according to the table above. We haven't tested Jetson H.264 and H.265 encoders and decoders in that pipeline. Since the H.264 and H.265 encoders work at the hardware level, encoding can be done in parallel with CUDA code, so we should be able to get even better performance.
We've done the same kernel time measurements for NVIDIA GeForce and Quadro GPUs. Here you can get the document with the benchmarks.
Software for Jetson performance comparison
We've released the software for a GPU-based camera application on GitHub, and it's available to download both binaries and source codes for our gpu camera sample project. It's implemented for Windows 7/10, Linux Ubuntu 18.04 and L4T. Apart from a full image processing pipeline on GPU for still images from SSD and for live camera output, there are options for streaming and for glass-to-glass (G2G) measurements to evaluate real latency for camera systems on Jetson. The software currently works with machine vision cameras from XIMEA, Basler, JAI, Matrix Vision, Daheng Imaging, etc.
To check the performance of Fastvideo SDK on a laptop/desktop/server GPU without any programming, you can download Fast CinemaDNG Processor software with GUI for Windows or Linux. That software has a Performance Benchmarks window, and there you can see timing for each stage of image processing. This is a more sophisticated method of performance testing, because the image processing pipeline in that software can be quite advanced, and you can test any module you need. You can also perform various tests on images with different resolutions to see how much the performance depends on image size, content and other parameters.
Other blog posts from Fastvideo about Jetson hardware and software
Jetson Image Processing
Jetson Zero Copy
Jetson Nano Benchmarks on Fastvideo SDK
Jetson AGX Xavier performance benchmarks
JPEG2000 performance benchmarks on Jetson TX2
Remotely operated walking excavator on Jetson
Low latency H.264 streaming on Jetson TX2
Performance speedup for Jetson TX2 vs AGX Xavier
Source codes for GPU-Camera-Sample software on GitHub to connect USB3 and other cameras to Jetson
Original article see at: https://www.fastcompression.com/blog/jetson-benchmark-comparison.htm
J2K codec performance on Jetson TX2
NVIDIA Jetson TX2 hardware is very promising for imaging and other embedded applications. That high-performance and low-power hardware is utilized in autonomous solutions, especially the industrial version Jetson TX2i. Since J2K compression is a common task for UAV (Unmanned Aerial Vehicle) applications, here we evaluate such a solution and its limitations.
Detailed info about our testing approach for JPEG2000 encoding and decoding on desktop/server NVIDIA GPUs can be found at the corresponding links. Here we follow exactly the same procedure, but apply it to the Jetson hardware.
J2K encoding/decoding parameters
File format – JP2
Lossy JPEG2000 compression with CDF 9/7 wavelet
Lossless JPEG2000 compression with CDF 5/3 wavelet
Compression ratio (for lossy algorithm) ~ 12.0:1 which corresponds to visually lossless encoding
Subsampling mode – 4:4:4
Number of DWT resolutions – 7
Codeblock size – 32×32
MCT – on
PCRD – off
Tiling – off
Window – off
Quality layers – one
Progression order – LRCP (L = layer, R = resolution, C = component, P = position)
Modes of operation – single or multithreaded batch
2K test image (24-bit) – 2k_wild.ppm
4K test image (24-bit) – 4k_wild.ppm
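To see what "Number of DWT resolutions – 7" means in practice, the sketch below lists the image size at each resolution. In JPEG2000 the number of resolutions equals the number of wavelet decomposition levels plus one, and each level halves the dimensions with rounding up. The 1920×1080 size of the 2K test image is an assumption here, and the function name is hypothetical.

```python
from typing import List, Tuple


def resolution_sizes(width: int, height: int, num_resolutions: int) -> List[Tuple[int, int]]:
    """Return (width, height) for each resolution, smallest first."""
    levels = num_resolutions - 1  # number of DWT decomposition levels
    sizes = []
    for r in range(num_resolutions):
        divisor = 2 ** (levels - r)
        # Integer ceiling division: -(-a // b) == ceil(a / b).
        sizes.append((-(-width // divisor), -(-height // divisor)))
    return sizes


for r, (w, h) in enumerate(resolution_sizes(1920, 1080, 7)):
    print(f"resolution {r}: {w}x{h}")
```

The sketch assumes the image origin is at (0, 0) and tiling is off, which matches the parameters listed above.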
In many cases the compression ratio for visually lossless encoding with the JPEG2000 algorithm could be much higher. So we would suggest testing different parameters to achieve the best compression ratio with acceptable image quality. By decreasing the quality coefficient, one can get not only better compression, but also a higher framerate for both encoding and decoding. Our benchmarks show the performance results for the above images and parameters. This is not the maximum performance, which could be better in many other cases.
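The compression ratio translates directly into file size and stream bitrate. A minimal sketch of that arithmetic, assuming the 2K test image is 1920×1080 at 24 bits per pixel; the 30 fps frame rate is a hypothetical example, not a benchmark result.

```python
def uncompressed_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of one raw frame in bytes."""
    return width * height * bits_per_pixel // 8


def compressed_bytes(raw_bytes: int, ratio: float) -> float:
    """Approximate size of one frame after compression at the given ratio."""
    return raw_bytes / ratio


def stream_mbit_per_s(frame_bytes: float, fps: float) -> float:
    """Bitrate of a stream of such frames in megabits per second."""
    return frame_bytes * 8 * fps / 1e6


raw = uncompressed_bytes(1920, 1080, 24)
packed = compressed_bytes(raw, 12.0)
print(raw, packed, stream_mbit_per_s(packed, 30.0))
```

Doubling the compression ratio halves both the storage per frame and the bandwidth needed for the link, which is why it matters for UAV downlinks.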
Hardware and software
NVIDIA Jetson TX2
CUDA Toolkit 10.2
JPEG2000 codec benchmarks on NVIDIA Jetson TX2
Jetson TX2 has a 4-core ARM Cortex-A57 @ 2 GHz and a 2-core Denver2 @ 2 GHz. These two types of cores have different performance, which should be taken into account. Since the Tier-2 stage of the JPEG2000 algorithm is implemented on CPU, the performance of both CPU and GPU determines the framerate. From that point of view, multithreading can be useful (we use up to 12 threads), but in single-threaded mode we could get different performance depending on the CPU core used. So in single-threaded mode we need to set an affinity mask to make sure the fastest CPU core is used.
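Setting an affinity mask on Linux can be sketched as follows. This uses the standard `os.sched_setaffinity` call, which is available on Linux only; the helper name is hypothetical, and which core index is the fastest on a given board is an assumption you need to verify on your own system.

```python
import os


def pin_to_core(core_id: int) -> bool:
    """Pin the current process to one CPU core; return False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # e.g. macOS or Windows
    # pid 0 means "the current process".
    os.sched_setaffinity(0, {core_id})
    return True


if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)
    print("cores available to this process:", sorted(allowed))
```

Call `pin_to_core` before starting the single-threaded benchmark, so that all measurements run on the same type of core.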
In the tests discussed we've restricted memory usage to 2 GB. This was done under the assumption that a Jetson TX2 may have only 4 GB of memory, so this is an important limitation for the whole image processing solution.
Here we haven't considered the task of J2K transcoding to H.264 on Jetson. That task requires additional tests, though from our previous experience with desktop/server GPUs, performance of the transcoding should not differ significantly, because Jetson has hardware support of H.264 encoding (separate from GPU), which is accessible via V4L2 interface and can be used simultaneously with JPEG2000 decoder.
By request we could offer Fastvideo SDK for Jetson for evaluation - please fill the form below and send it to us.
Other info from Fastvideo concerning JPEG2000 and Jetson
JPEG2000 codec on GPU
JPEG2000 vs JPEG vs PNG: What's the Difference?
J2K encoding benchmarks
J2K decoding benchmarks
Fast FFmpeg J2K decoder on NVIDIA GPU
MXF Player
Jetson Benchmark Comparison: Nano vs TX2 vs Xavier
Jetson image processing for camera applications
Original article see at: https://www.fastcompression.com/blog/j2k-codec-on-jetson-tx2.htm