Knowing that Python is very slow compared to Java and C++, why do they mostly use Python for fast algorithmic procedures like machine lea…

Knowing that Python is very slow compared to Java and C++, why do they mostly use Python… by Trausti Thor Johannsson

Answer by Trausti Thor Johannsson:

I think your premise is not quite right, though not totally wrong either.

Just because a compiled program is much faster does not mean an interpreted one is very slow. This might have been an issue in the 1990s, but it really isn’t today.

Let me give you a few examples.

Say I get a huge file of comma-separated values (CSV), many tens of megabytes. Not all values are filled in. There is also a guest field that might hold zero to four extra people, and I need to do some basic calculations on all of this.

Writing such a program in Python takes me just a few minutes, fewer than 10. Running it takes only 2–3 seconds, and I have my results, nicely formatted.
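To give a feel for how little code this kind of job needs, here is a minimal sketch. The column names `name` and `guests`, the sample rows, and the rule that blank or malformed guest counts mean zero are all assumptions for illustration; the real file surely looked different.

```python
import csv
import io

def count_attendees(csv_file):
    """Return (rows, total_people) for a CSV with a hypothetical 'guests' column."""
    rows = 0
    total_people = 0
    for record in csv.DictReader(csv_file):
        rows += 1
        raw = (record.get("guests") or "").strip()
        try:
            guests = int(raw)
        except ValueError:
            # Assumption: a blank or malformed guest count means no extra people.
            guests = 0
        # The guest field is described as zero to four extra people; clamp to that range.
        guests = max(0, min(guests, 4))
        total_people += 1 + guests
    return rows, total_people

# Made-up sample data with a missing value and a faulty value.
sample = io.StringIO(
    "name,guests\n"
    "Anna,2\n"
    "Bjorn,\n"
    "Clara,n/a\n"
    "David,4\n"
)
print(count_attendees(sample))  # (4, 10)
```

For a real file you would pass `open("some_file.csv", newline="")` instead of the in-memory sample.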

Of course, this would take me only 20 minutes or so in plain C, but knowing my own programming, and given a few faulty data inputs, it would take a bit longer to write and get the results. So I probably saved myself at least 30 minutes by doing this in Python, and the C version would still take about 2 seconds to run through the list.

So what would I have saved there by using a programming language that is as fast as they get?

Of course, I could have used libraries such as Cocoa or Boost in C/C++, but unless I was well practiced and using C++ all day, it would still take me much longer than in Python.

Now let's make things a bit more difficult. I have four different web services to talk to: I fetch data from each, crunch it, and save it to my database, with all four data sets merged so that the matching items from each source become one item in the database. These could be a people directory, a car registry, a registry of loans and liens on the person and the car, and finally the police directory, to check that the person and car are registered correctly, that there are no outstanding tickets, and that the person has a valid driving license.

I can say I would be done in 30–40 minutes in Python, including running the program. Sure, because of network latency it would take 20 seconds or so to run, but all the data would be collected and stored.

I am not sure how long this would take me in C, Objective-C or C++, but it would be quite a lot longer unless I was using a lot of libraries and other tools. The resulting program would still take about the same time to run, due to network and database latency.
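A rough sketch of why the language barely matters here: the function `fetch` below is a hypothetical stand-in that only sleeps to simulate network latency (no real service is called), and the source names and record layout are made up. The point is that nearly all of the run time is spent waiting, while the merge itself is a few dictionary updates.

```python
import time

def fetch(source_name, latency_s):
    """Hypothetical stand-in for a web service call; returns records keyed by person ID."""
    time.sleep(latency_s)  # simulated network wait, not real I/O
    return {
        "p1": {source_name: "data for p1"},
        "p2": {source_name: "data for p2"},
    }

def merge_sources(source_names, latency_s=0.05):
    """Fetch every source and merge the records so each person ID becomes one item."""
    merged = {}
    waited = 0.0
    for name in source_names:
        start = time.perf_counter()
        records = fetch(name, latency_s)
        waited += time.perf_counter() - start
        for person_id, fields in records.items():
            merged.setdefault(person_id, {}).update(fields)
    return merged, waited

sources = ["people", "cars", "liens", "police"]
start = time.perf_counter()
merged, waited = merge_sources(sources)
total = time.perf_counter() - start

print(f"waited {waited:.3f}s out of {total:.3f}s total")
print(sorted(merged["p1"]))
```

Rewriting the merge loop in a faster language would shave time off only the small part that is not waiting.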

So what time would I have saved there by using a much faster language?

Then my boss comes to my office and tells me I need to add a few more data sources: talk to the dealer, get all the extra equipment on the car, and check whether it has had all the services required for insurance.

That work would take me 10 minutes in Python, perhaps 15 because I had a cup of coffee. It would not take long to run at all.

In the faster languages, this might have forced me to rewrite and refactor a lot of code, and it would certainly take a lot longer to write.

This is why a lot of people use Python, Ruby, Perl and other languages to do their work fast, reliably and without much fuss. It does not matter whether the program takes 1 minute or 5 seconds to run; it makes no difference.

On the other hand, if you are making a game engine or a ray tracer, I would never use Python. I would use the fastest language available, probably some relative of C. I would never be able to make a real 3D game in anything less. Sure, I could use Lua for a lot, but that is more because of how Lua works with C, and beside the point.

If I were writing a program to sequence DNA and run a lot of tests on it, massively threaded, where every hour the task runs costs tens of thousands of dollars, then every second counts, and I would use C/C++/Objective-C or something similar. Even if it took me a week to write the software, that would still cost less than the $10,000 it costs to run for an hour. If my sequencer then runs for 2 weeks, the total cost would be millions of dollars. Had I written my code in Python, the cost of that job could have been in the tens of millions, if not more than a hundred million dollars. At that point, even if I spent a year writing the sequencer, it would be a drop in the ocean compared to the total cost.
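The arithmetic behind those numbers can be sketched quickly. The $10,000 per hour and the two-week run come from the example above; the slowdown factors of 10 and 30 are assumptions picked only to show what "tens of millions" and "more than a hundred million" would imply, not measurements of Python against C.

```python
HOURLY_COST = 10_000   # dollars per hour, the figure used in the example
RUN_HOURS = 14 * 24    # a two-week run is 336 hours

def job_cost(slowdown=1):
    """Total cost if the job takes `slowdown` times as long (hypothetical factor)."""
    return HOURLY_COST * RUN_HOURS * slowdown

for slowdown in (1, 10, 30):
    print(f"{slowdown:>2}x: ${job_cost(slowdown):,}")
#  1x: $3,360,000
# 10x: $33,600,000
# 30x: $100,800,000
```

Against totals like these, a week or even a year of a programmer's time is small.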

So both languages have their jobs to do, very different jobs, though they both do the same basic thing: take data in and spit data out. You have to pick the right tool. If you think Python is always the right tool, or that C/C++ is always correct for any job, you are a lousy programmer. It is said that when all you have is a hammer, all problems look like nails. The issue is that not all problems are nails. And today, in the latter half of 2016, you have all these wonderful tools, so try to pick the right tool for the job.

All computers are extremely fast. Even the smallest 5 W CPU in the MacBook, which I have on many occasions called a useless computer, is very fast compared to a machine from 10 years ago, or to my G4 PowerBook from 2003. The MacBook is insanely fast compared to that machine, even though I still consider it a worthless machine. But that is me.

I hope this explains the issue. If you have any more questions about this, please ask away in the comments.
