Comparing image processing times across language processing systems and libraries

For each language processing system (JavaScript, Python, C++, Java, C#.NET) and library (OpenCV, Pillow), I measured the time taken to perform a simple image processing task.

Processing

Color images are converted to grayscale and binarized.

Original image

Result Image

Grayscale conversion uses the following BT.601 formula, which is consistent with OpenCV's internal processing.
```
 0.299 * R + 0.587 * G + 0.114 * B 
```
The binarization threshold is 128.
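
As a worked example of the formula and threshold, here is a minimal Python sketch for a single pixel. The function name `binarize_pixel` is hypothetical, and the `gray < threshold` comparison follows the hand-written loops shown later; the library calls may treat the boundary value differently.

```python
def binarize_pixel(r, g, b, threshold=128):
    """Return 0 or 255 for one RGB pixel using BT.601 luma and a fixed threshold."""
    gray = round(0.299 * r + 0.587 * g + 0.114 * b)
    return 0 if gray < threshold else 255

print(binarize_pixel(255, 0, 0))  # pure red: luma 76.245 -> 0
print(binarize_pixel(0, 255, 0))  # pure green: luma 149.685 -> 255
```

Green carries the largest weight, so a saturated green pixel ends up white while saturated red or blue pixels end up black.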

Implementation

I wrote an implementation that is typical for each processing system. The implementations are not completely identical: the input/output data formats and detailed processing methods (whether memory is allocated, how values are rounded, etc.) differ.
Only the code inside the timed region is shown below. Image input and output were performed before and after this region, and I confirmed that grayscale conversion and binarization were performed correctly.
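
The article does not show how the timed region was measured, so the following is only a sketch of one common approach in Python: wrap the processing step alone with `time.perf_counter` and keep file I/O outside it. The helper name `measure_ms`, the repeat count, and the choice of reporting the best run are all assumptions made for illustration.

```python
import time

def measure_ms(process, *args, repeats=5):
    """Run process(*args) several times and return the best wall-clock time in ms."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        process(*args)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best

# Stand-in workload; in practice this would be the grayscale + binarize step,
# with image loading done before the call and saving done after it.
pixels = list(range(256)) * 1000
elapsed = measure_ms(lambda p: [255 if x >= 128 else 0 for x in p], pixels)
print(f"{elapsed:.2f} ms")
```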

JavaScript / Edge, Chrome (Data Format: ImageData)

 const binTh = 128;
 const data = imgData.data;
 for (let i=0; i<data.length; i+=4){
  	const gry = Math.round(0.299 * data[i] + 0.587 * data[i+1] + 0.114 * data[i+2]);
  	data[i] = data[i+1] = data[i+2] = (gry < binTh) ? 0 : 255;
 }

Pillow / Python (Data Format: Image)

 bin_th = 128
 gray = img.convert("L")
 binary = gray.point(lambda x: 255 if x > bin_th else 0)
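
When `point` is given a function for an 8-bit image, Pillow calls it once per possible pixel value to build a lookup table and then applies that table to every pixel. The stdlib-only sketch below illustrates the same lookup-table idea, with a `bytes` object standing in for the grayscale pixel buffer; the names `lut` and `gray_pixels` are illustrative.

```python
bin_th = 128

# Build the 256-entry table once: one output value per possible input value.
lut = bytes(255 if x > bin_th else 0 for x in range(256))

# Stand-in for an 8-bit grayscale buffer (one byte per pixel).
gray_pixels = bytes([0, 127, 128, 129, 255])

# bytes.translate maps every byte through the table without a Python-level loop.
binary_pixels = gray_pixels.translate(lut)
print(list(binary_pixels))  # [0, 0, 0, 255, 255]
```

Because the lambda runs only 256 times regardless of image size, the per-pixel cost is a table lookup rather than a Python function call.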

NumPy / Python (Data Format: NumPy array)

 bin_th = 128
 # img is indexed as BGR here (channel 2 = R, channel 0 = B), e.g. as returned by cv2.imread
 gray = np.rint(0.299 * img[:, :, 2] + 0.587 * img[:, :, 1] + 0.114 * img[:, :, 0])
 binary = np.where((gray > bin_th) , 255, 0)

C++ (Data Format: 4byte integer array)

 #define Colorref2Red(RGB) ((unsigned char)(RGB)) 
 #define Colorref2Green(RGB) ((unsigned char)(((unsigned long) (RGB)) >> 8)) 
 #define Colorref2Blue(RGB) ((unsigned char)((RGB) >> 16)) 
 #define Rgb2Colorref(r, g ,b)  ((unsigned long)(((unsigned char)(r) | \
     (((unsigned short)(unsigned char)(g)) << 8)) | \
     (((unsigned long)(unsigned char)(b)) << 16)))
 const int binTh = 128;
 int cnt = width * height;
 unsigned long* ptr = img;
 while (cnt--) {
	const int bin = (std::round(0.299 * Colorref2Red(*ptr) + 0.587 * Colorref2Green(*ptr) + 0.114 * Colorref2Blue(*ptr)) < binTh) ? 0 : 255;
	*ptr++ = Rgb2Colorref(bin, bin, bin);
 }
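
The macros above treat each pixel as a packed `0x00BBGGRR` integer, with red in the lowest byte. A small Python sketch of the same bit manipulation, using hypothetical helper names, may make the layout easier to see:

```python
def unpack_colorref(value):
    """Split a 0x00BBGGRR integer into (r, g, b)."""
    r = value & 0xFF
    g = (value >> 8) & 0xFF
    b = (value >> 16) & 0xFF
    return r, g, b

def pack_colorref(r, g, b):
    """Pack 8-bit r, g, b into a 0x00BBGGRR integer."""
    return (r & 0xFF) | ((g & 0xFF) << 8) | ((b & 0xFF) << 16)

pixel = pack_colorref(0x11, 0x22, 0x33)
print(hex(pixel))              # 0x332211
print(unpack_colorref(pixel))  # (17, 34, 51)
```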

Java (Data Format: BufferedImage)

 int width = img.getWidth();
 int height = img.getHeight();
 int binTh = 128;
 for (int y = 0; y < height; y++) {
	for (int x = 0; x < width; x++) {
		int col = img.getRGB( x, y );
		int gray =  (int)(0.299 * (double)((col >> 16)&0xff) + 0.587 * (double)((col >> 8)&0xff) + 0.114 * (double)(col&0xff));
		int binary = (gray < binTh) ? 0 : 255;
		img.setRGB( x, y, binary << 16 | binary << 8 | binary );
	}
 }
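
One of the detailed processing differences is visible here: the `(int)` cast truncates the luma (as does the C#.NET version below), whereas the JavaScript and C++ versions round it. The following Python sketch, with illustrative function names, shows a pixel for which the two approaches produce different binary values:

```python
def luma(r, g, b):
    """BT.601 luma as a float."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def binarize_truncated(r, g, b, threshold=128):
    # int() drops the fractional part, like an (int) cast on a positive value
    return 0 if int(luma(r, g, b)) < threshold else 255

def binarize_rounded(r, g, b, threshold=128):
    # round() goes to the nearest integer for this value, like Math.round or std::round
    return 0 if round(luma(r, g, b)) < threshold else 255

# Luma is about 127.701: truncation gives 127 (black), rounding gives 128 (white).
print(binarize_truncated(127, 128, 128))  # 0
print(binarize_rounded(127, 128, 128))    # 255
```

Such differences affect only pixels whose luma falls just below the threshold, so they change the output slightly but should have little bearing on processing time.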

C#.NET (Data Format: Bitmap)

 int binTh = 128;
 for (int y = 0; y < bitmap.Height; y++)
 {
	for (int x = 0; x < bitmap.Width; x++)
	{
		Color col = bitmap.GetPixel(x, y);
		int gray = (int)((col.R * 0.299) + (col.G * 0.587) + (col.B * 0.114));
		int bin = (gray < binTh) ? 0 : 255;
		bitmap.SetPixel(x, y, Color.FromArgb(bin, bin, bin));
	}
 }

OpenCV.js / JavaScript / Edge, Chrome (Data Format: Mat)

 const binTh = 128;
 cv.cvtColor(src, dst, cv.COLOR_RGBA2GRAY, 0);
 cv.threshold(dst, binary, binTh, 255, cv.THRESH_BINARY);

OpenCV / Python (Data Format: Mat)

 bin_th = 128
 # Assumes img is in BGR order (as returned by cv2.imread); use cv2.COLOR_RGB2GRAY for RGB input
 gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
 ret, binary  = cv2.threshold(gray, bin_th, 255, cv2.THRESH_BINARY)

OpenCV / C++ (Data Format: Mat)

 const int binTh = 128;
 cvtColor(img, gray, COLOR_BGR2GRAY);
 threshold(gray, binary, binTh, 255, THRESH_BINARY);

OpenCV / Java (Data Format: Mat)

 double binTh = 128;
 Imgproc.cvtColor(img, gray, Imgproc.COLOR_BGR2GRAY);
 Imgproc.threshold(gray, binary, binTh, 255, Imgproc.THRESH_BINARY);

OpenCVSharp / C#.NET (Data Format: Mat)

 int binTh = 128;
 Mat gray = img.CvtColor(ColorConversionCodes.BGR2GRAY);
 Mat binary = gray.Threshold(binTh, 255, ThresholdTypes.Binary);
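
The boundary value is another small difference between the implementations. OpenCV's `THRESH_BINARY` sets a pixel to the maximum value only when it is strictly greater than the threshold, while the JavaScript, C++, Java, and C#.NET loops map a gray value equal to the threshold to 255. A minimal Python sketch of the two rules, with illustrative function names and plain integers standing in for pixels:

```python
def threshold_strict(gray, thresh=128, maxval=255):
    # Same rule as THRESH_BINARY: maxval only if gray > thresh
    return maxval if gray > thresh else 0

def threshold_loop_style(gray, thresh=128, maxval=255):
    # Same rule as the hand-written loops: 0 only if gray < thresh
    return 0 if gray < thresh else maxval

for gray in (127, 128, 129):
    print(gray, threshold_strict(gray), threshold_loop_style(gray))
# 127 0 0
# 128 0 255
# 129 255 255
```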

Processing time measurement

The measurements were run on Windows with an Intel Core i7-1260P (2.1 GHz).
They were taken in October 2024.

Test images

4094 x 3780 pixels

7230 x 5428 pixels

14364 x 11356 pixels

Note: The 4094 x 3780 pixel image is an aerial photograph taken by the Geospatial Information Authority of Japan.
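
To give a sense of scale, the following sketch computes the pixel count of each test image and the size of an uncompressed buffer. The figure of 4 bytes per pixel (as in `ImageData` or a packed 32-bit integer array) is an assumption for illustration; a 3-channel `Mat` would need less.

```python
BYTES_PER_PIXEL = 4  # assumption: RGBA or packed 32-bit pixels

def buffer_size(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Return (pixel count, uncompressed size in bytes)."""
    pixels = width * height
    return pixels, pixels * bytes_per_pixel

for width, height in [(4094, 3780), (7230, 5428), (14364, 11356)]:
    pixels, size = buffer_size(width, height)
    print(f"{width} x {height}: {pixels / 1e6:.1f} megapixels, {size / 2**20:.0f} MiB")
```

The three images come to roughly 15.5, 39.2, and 163.1 megapixels, so the largest has more than ten times the pixels of the smallest.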

Measured processing times (unit: ms)

Processing time graph

[Graph: processing time for the selected test image: 4094 x 3780, 7230 x 5428, or 14364 x 11356 pixels]

Note: The C#.NET processing time exceeds the graph's drawing area.

Impressions

Because the input/output data formats and detailed processing methods differ, these results do not allow a strict comparison of which processing system is superior, but they do show the general trends.
JavaScript processing times are roughly the same in Edge and Chrome, most likely because both are based on Chromium and use the same JavaScript engine, Google V8.
With OpenCV, the processing time differs between OpenCV.js running in the browser and the other bindings, but either way it is faster than not using OpenCV.
Apart from OpenCV.js, OpenCV's processing time is roughly the same regardless of the processing system. Calling OpenCV from C++ or Java is fastest, but the difference is not significant, so it is best to choose whichever programming environment is easiest to use on the target platform.
OpenCV.js is a subset of OpenCV compiled to WebAssembly, and it runs inside a browser with resource constraints, so reduced performance is to be expected. In this experiment, to compare performance, huge images that do not fit on the screen were processed at their original resolution, which is arguably an unnatural workload for in-browser processing. Incidentally, the sample code in the OpenCV.js article on the official OpenCV website runs quickly because it reduces the image to the screen size before processing it.
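
The effect of reducing an image to screen size first can be estimated with simple arithmetic. The sketch below assumes a hypothetical 1920 x 1080 display (not stated in this article) and computes the scaled size that preserves the aspect ratio, along with how many fewer pixels would need processing. The helper name `fit_within` is illustrative.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit inside the given box, keeping aspect ratio."""
    scale = min(max_width / width, max_height / height, 1.0)  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))

src_w, src_h = 14364, 11356                          # largest test image
dst_w, dst_h = fit_within(src_w, src_h, 1920, 1080)  # assumed screen size
reduction = (src_w * src_h) / (dst_w * dst_h)
print(dst_w, dst_h)  # 1366 1080
print(f"about {reduction:.0f}x fewer pixels")
```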
The core concern of language processing systems has always been "optimization," but program "productivity" has grown in importance, and emphasis is now placed on the "programming environment" as a whole, including libraries. When implementing general processing that combines algorithms devised by our predecessors, it is a good idea to leave type checking and memory management to the language processing system and implement the required functions with a high-speed library. When developing a completely new algorithm, no existing library can be used, so you have no choice but to write it yourself. Even then, if it is implemented in C/C++, it can be called directly from most languages through a mechanism called an FFI (foreign function interface) or a binding, which makes development both high-performance and highly productive.
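
As an illustration of the FFI mechanism (not code used in the measurements), the following sketch calls a C function from Python through the standard `ctypes` module. The C standard library's `abs` stands in for a custom C/C++ image routine, and the library lookup assumes a typical Windows, Linux, or macOS setup.

```python
import ctypes
import ctypes.util
import sys

# Locate the C runtime: msvcrt on Windows, libc elsewhere.
if sys.platform == "win32":
    libc = ctypes.CDLL("msvcrt")
else:
    libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature int abs(int) so ctypes converts arguments correctly.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

print(libc.abs(-128))  # 128
```

A custom routine compiled into a shared library would be loaded the same way, with its own `argtypes` and `restype` declarations.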