I do not think the problem lies in the ARMv8 library. The only difference between them is the compiler; they use the same SDK source code.
Maybe you can use our demo to make a small test first.
I will continue to investigate this small issue.
I am continuing my work with the Jetson Nano. For information, CUDA programming with Python works. To do this, I have to use the PyCuda library. You write a Python program, but PyCuda lets you embed C code in it, and that code is compiled with the nvcc compiler.
It works well.
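To illustrate what "C code inside a Python program" looks like, here is a minimal sketch. It is not the code from my software: the kernel name `invert` and the helper `run_invert` are made up for the example, and I assume the image is a flat uint8 NumPy array. The PyCuda imports are done inside the function so the file can still be loaded on a machine without CUDA.

```python
# Sketch only: the kernel "invert" and the helper run_invert are
# hypothetical names, not taken from the real software.
KERNEL_SRC = """
__global__ void invert(unsigned char *dst, const unsigned char *src, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        dst[i] = 255 - src[i];
}
"""


def run_invert(pixels):
    """Invert a flat uint8 array on the GPU (needs PyCuda and a CUDA device)."""
    # Imported lazily so this module loads even where CUDA is not installed.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    mod = SourceModule(KERNEL_SRC)        # nvcc compiles the C source here
    invert = mod.get_function("invert")

    src = np.ascontiguousarray(pixels, dtype=np.uint8)
    dst = np.empty_like(src)
    threads = 256                                   # threads per block (assumed)
    blocks = (src.size + threads - 1) // threads    # enough blocks to cover n
    invert(drv.Out(dst), drv.In(src), np.int32(src.size),
           block=(threads, 1, 1), grid=(blocks, 1))
    return dst
```

On the Nano, calling `run_invert(frame.ravel())` would return the inverted pixels; the first call is slower because nvcc has to compile the kernel.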
I ran the algorithm on the Jetson Nano (it can filter 1280*720 images at 25 fps in real time). It is just a test, but I think it is interesting.
You can see it here :
Just for information about the Nvidia Jetson Nano:
Yesterday, I compiled the CUDA examples (from the CUDA toolkit) to see how the Maxwell GPU performs.
I tried some 3D and image filtering demos. The result is just amazing. It is far more powerful than the classic OpenCV routines.
Of course, CUDA is not easy to manage, but the performance is really impressive.
With that kind of SBC, you can run very complex programs with very good performance.
You should try the Jetson Nano CUDA examples.
If one day you plan to build a really powerful ASIAIR, an SBC like the Jetson Nano can bring you the power you need.
Still working on the Jetson Nano and PyCuda.
At work I found a 5VDC 2.5A power supply to temporarily solve my problems with my USB 5VDC power supply (the Jetson Nano crashes with it, and the hot weather does not help very much).
First, I improved the management of the number of blocks and threads. Now I use 2D blocks and a 2D grid.
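For anyone wondering how a 2D grid is sized, here is a small sketch in plain Python. The 16*16 block size and the helper name `launch_dims` are assumptions for the example, not necessarily what I use. The idea is to round up so the grid covers the whole image; the kernel must then ignore threads that fall outside the picture.

```python
def launch_dims(width, height, block=(16, 16)):
    """Return (block, grid) tuples covering a width*height image.

    The 16*16 default block size is an assumption for this sketch.
    """
    bx, by = block
    # Ceiling division: a partial block is still needed for leftover pixels.
    grid_x = (width + bx - 1) // bx
    grid_y = (height + by - 1) // by
    return (bx, by, 1), (grid_x, grid_y)


# Full IMX178 frame and BIN 2 frame
print(launch_dims(3096, 2080))   # ((16, 16, 1), (194, 130))
print(launch_dims(1544, 1040))   # ((16, 16, 1), (97, 65))
```

The two tuples have the shape PyCuda expects for the `block=` and `grid=` arguments of a kernel call.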
I use about 10 filters in my sky survey software. So far, I have succeeded in converting 8 of them (using a naive method of course, I am a complete beginner with the Jetson Nano).
NumPy routines are quite easy to use with PyCuda.
Filters like blur, Gaussian blur or sharpen need a convolution filter with different kernels, so I had to write a convolution filter using PyCuda. Not a big deal.
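To show what "one convolution filter, different kernels" means, here is a naive reference version in plain Python (no GPU). It is only a sketch: the kernel values are the usual textbook 3*3 box blur and sharpen kernels, and clamping at the image border is my assumption here. A GPU kernel would compute the same sum, with one thread per output pixel.

```python
# Textbook 3*3 kernels (assumed values, not taken from the real software).
BOX_BLUR = [[1 / 9] * 3 for _ in range(3)]
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]


def convolve2d(image, kernel):
    """Naive 2D convolution of a list-of-lists image; borders are clamped."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(kh):
                for kx in range(kw):
                    # Source pixel for this kernel tap, clamped to the image.
                    sy = min(max(y + oy - ky, 0), h - 1)
                    sx = min(max(x + ox - kx, 0), w - 1)
                    acc += kernel[ky][kx] * image[sy][sx]
            out[y][x] = acc
    return out


# One bright pixel in the middle of a dark 3*3 image.
spot = [[0, 0, 0],
        [0, 9, 0],
        [0, 0, 0]]
sharp = convolve2d(spot, SHARPEN)
blurred = convolve2d(spot, BOX_BLUR)
```

Only the kernel changes between blur and sharpen; the loop is the same, which is why a single convolution routine is enough for several filters.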
I ran tests with the classical method (NumPy + OpenCV) and the PyCuda method.
For a 3096*2080 pixels picture (IMX178 resolution) using 8 filters (maximum load) :
Numpy + opencv 4.1.0 : 7.4 seconds
PyCuda : 1.04 seconds
For a 1544*1040 pixels picture (BIN 2 camera) using 4 filters (typical load) :
Numpy + opencv 4.1.0 : 1.2 seconds
PyCuda : 0.21 seconds
PyCuda outperforms NumPy + OpenCV 4.1.0 (about 7.1x faster at maximum load and 5.7x faster at typical load).
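The speedup factors come straight from the timings above; the short calculation below just divides the NumPy + OpenCV time by the PyCuda time for each case. The dictionary labels and the helper name `speedup` are only for this example.

```python
# Timings in seconds, copied from the measurements above:
# (NumPy + OpenCV 4.1.0, PyCuda)
timings = {
    "3096*2080, 8 filters": (7.4, 1.04),
    "1544*1040, 4 filters": (1.2, 0.21),
}


def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU path is than the CPU path."""
    return cpu_seconds / gpu_seconds


for label, (cpu, gpu) in timings.items():
    print(f"{label}: {speedup(cpu, gpu):.1f}x")
# 3096*2080, 8 filters: 7.1x
# 1544*1040, 4 filters: 5.7x
```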
I did not test against OpenCV 3.3, but earlier tests with OpenCV 4.1.0 (compiled from source on the Jetson Nano) make me think that PyCuda is 10 to 15 times faster than OpenCV 3.3.
Now I need to work on more complex filters like NLM denoising. I looked at the CUDA example but, to be honest, it is quite hard to understand. I think I will need help from Nvidia to write such a filter with PyCuda.
When all my filters work, I will put them in my sky survey software to run tests with the ASI178MC and the Jetson Nano.
Something different: do you have (good) news about a new camera with a sensor that could suit my project?
Have a nice day.
It seems that your progress has been huge.
As for the new camera, sorry, there is no good news yet. I think you will need to wait several more months.
Many thanks for your reply.
Yes, things are going quite fast with the Nano. Special thanks to PyCuda. It makes CUDA easier to use but, of course, it will be slower than pure CUDA programming. For now, PyCuda will be enough.
No problem with the camera. I will wait. The most important thing is to get the best camera for my project.
Have a nice day.
All the filters work quite well and the real-time processing is much faster than in the previous SkyPi version (mainly NumPy + OpenCV).
I made a small video of the first indoor test of SkyNano.
During the test, the camera gain and exposure time stay the same.
When filters are OFF, we see almost nothing.
When filters are ON, we see quite well.
During the test, I watched the Maxwell GPU activity:
- when filters are ON, the GPU activity is high
- when filters are OFF, the GPU activity is low
The link to see the video :
Now I must run outdoor tests.
I have read about the RPi 4 and I don't think I will give it a try.
First, there is an issue with USB-C, so I think it is more reasonable to wait for a new RPi 4 without this issue.
The CPU (from what I can remember, it's an ARM A72) is interesting, but heat will be a significant problem. Without a good heat sink and probably a fan, the RPi 4 CPU gets far too hot when you run it at full power for a while.
Considering my experience with other SBCs like the Odroid N2 and Jetson Nano, the RPi 4 is slower than those 2 boards, so it is not that interesting.
For ASIAIR 2, the RPi 4 could be interesting because it is much faster than the RPi 3 B+ (and the 4 GB of RAM is also good), but you will have to manage heat dissipation because, without a heat sink, you will have many issues with the RPi 4.
For full power, the Jetson Nano is really good. I have succeeded in rewriting all my processing routines with PyCuda and CUDA, and now the Jetson Nano is about 5 times faster than the Odroid N2.
For now, it is written in Python with PyCuda, so it is slower than pure C and CUDA programming. I first want to get fully working Python/PyCuda software and, when everything is OK, I will try to rewrite my software in C++.