[Discuss] 'C' string tokenizer for those who hate strtok
Alan W. Irwin
irwin at beluga.phys.uvic.ca
Fri Jun 30 11:51:03 PDT 2006
On 2006-06-30 09:26-0700 Adam Parkin wrote:
> Are you telling me you can see a huge difference in the runtimes of the C
> version versus the Perl/Python version?
>
> I'm not disagreeing with you that Perl/Python has a cost associated with it,
> what I'm saying is that in an awful lot of problem domains the difference is
> so minor (particularly on modern machines) that it becomes difficult to see
> why one would choose the "fast" compiled language rather than the "slow"
> interpreted one.
>
> Sure you can always come up with the specific case where program X written in
> C runs 50% faster than program Y written in Perl, but in the general case
> it's often hard to see the performance benefits of C over other languages,
> but easy to see the maintenance, safety, and SENG benefits of other languages
> over C.
> [...] First thing: please point me in the direction of some hard data that shows
> that C is so much faster than Perl/Python...
We have discussed the efficiency of compiled (C) versus interpreted
(Perl/Python) languages before. Typically for numerical work with arrays,
Perl/Python is 100 times slower (yes, that large factor is not a
misprint). Any trivial test involving numerical arrays
should prove the point for you. In fact, the whole point of the
Python/Numeric and Perl/PDL projects is to address this deficiency in their
respective base languages. They both do the array programming in efficient
C code, and define an API so that Perl/PDL or Python/Numeric can use that C
code. However, these fixups only deal with some of the efficiency issues
with Perl/Python, and in general you can expect those interpreted languages
to be much slower than compiled languages unless there is an API for doing
exactly what you want (such as splitting strings).
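As a rough illustration of that last point (a sketch of mine, not a
benchmark from this thread), the following Python script splits the same
string two ways: with a hand-written character loop that runs in the
interpreter, and with the built-in str.split(), which CPython implements in
C. The names manual_split and builtin_split and the sample text are made up
for the example, and the timings will depend on your machine and Python
version.

```python
import timeit


def manual_split(text):
    """Split on whitespace with an explicit character loop (interpreted path)."""
    tokens = []
    current = []
    for ch in text:
        if ch.isspace():
            # Whitespace ends the current token, if one is in progress.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    # Flush the final token when the text does not end in whitespace.
    if current:
        tokens.append("".join(current))
    return tokens


def builtin_split(text):
    """Split on whitespace with str.split, which CPython implements in C."""
    return text.split()


if __name__ == "__main__":
    # Hypothetical sample input; any long whitespace-separated text will do.
    sample = "the quick brown fox jumps over the lazy dog " * 1000
    # Both approaches must agree before their timings are worth comparing.
    assert manual_split(sample) == builtin_split(sample)
    for func in (manual_split, builtin_split):
        seconds = timeit.timeit(lambda: func(sample), number=20)
        print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```

The two functions return the same tokens, so any difference in the printed
times comes from where the loop runs rather than from what it computes.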
The high efficiency of C is a given, but I also agree with the maintenance,
safety, etc. points about C, and I suggest (as recently demonstrated by the
string-splitting exercise) that there is often a better language option than
C for any given problem.
To give another example, most of my own computer needs are for numerical
work, and I have recently concluded that the higher-level language of
choice in that case is again fortran. I have recently had the pleasure of
doing some fortran 95 coding (now that free fortran 95 compilers such as
gfortran and g95 are available). That language is just as high level as
Python/Numeric so, for example, you can numerically process whole arrays
without writing any explicit do loops. That is a huge advantage in
the scientific programming context where most of the numerical work involves
array processing. Also, fortran should have essentially the same numerical
efficiency as C (the gfortran and g95 compilers share the GNU Compiler
Collection back end with the gcc compiler).
The one question mark with fortran libraries is the best way to interface
them with code written in other languages. For some of my own fortran
libraries I am considering interfacing them to C using the cfortran.h
approach, and from that C API to the rest of a large set of languages
(including python, java, perl, ruby, and ocaml, see
http://www.swig.org/compat.html#SupportedLanguages) using swig. For C and
C++ libraries, you can just eliminate the cfortran.h step and interface them
directly using swig (which is how, for example, the PLplot team creates the
python and java interfaces to the C PLplot library).
This continues to be a most interesting thread. Thanks, Peter, for starting
it.
Alan
__________________________
Alan W. Irwin
Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).
Programming affiliations with the FreeEOS equation-of-state implementation
for stellar interiors (freeeos.sf.net); PLplot scientific plotting software
package (plplot.org); the Yorick front-end to PLplot (yplot.sf.net); the
Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project
(lbproject.sf.net).
__________________________
Linux-powered Science
__________________________