[Discuss] A question for the Python gurus

Adam Parkin pzelnip at telus.net
Sat Jul 29 01:21:22 PDT 2006


Brian Quinlan wrote:
> Rewrite 1
> ---------
> 
> while 1:
>     line = sys.stdin.readline()
>     if not line:
>         break
> 
>     # now do something with line

Which was the original version I was complaining about. =8-p

> Rewrite 2
> ---------
> 
> for line in sys.stdin:
>     # now do something with line

I actually tried this, but found that there were two problems:

1) the body of the loop didn't seem to execute until after EOF was found
2) for some reason I had to hit EOF twice to get the loop to terminate

The exact code I wrote was:

#!/usr/bin/python

import sys

for line in sys.stdin:
    print "you entered '" + line + "'"

But when I do this, I don't see the "you entered..." output until after 
I hit EOF, and then I get prompted again for more input (the program 
doesn't terminate until I hit EOF twice in a row).
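
A likely explanation for the behaviour above is the hidden read-ahead 
buffer that Python 2's file iterator uses, which holds lines back until 
the buffer fills or EOF arrives.  The sketch below shows one way to 
handle each line as soon as it is read, using the two-argument form of 
iter() with readline.  The name echo_lines and the StringIO stand-in for 
stdin are illustrative assumptions, and the code is written in Python 3 
syntax.

```python
import io

def echo_lines(stream, out):
    """Echo each line as soon as it is read; return the number of lines."""
    count = 0
    # iter(callable, sentinel) keeps calling stream.readline() until it
    # returns '' (EOF), so each line is processed before the next read.
    for line in iter(stream.readline, ''):
        out.write("you entered '" + line.rstrip('\n') + "'\n")
        out.flush()
        count += 1
    return count

# Stand-in streams for demonstration (assumption: real use would pass
# sys.stdin and sys.stdout instead).
fake_stdin = io.StringIO("hello\nworld\n")
fake_stdout = io.StringIO()
lines_seen = echo_lines(fake_stdin, fake_stdout)
```

With a terminal, echo_lines(sys.stdin, sys.stdout) should print each 
response right after the line is entered and stop at the first EOF.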

> Right, because this is a common source of errors i.e. people writing 
> assignment when they meant comparison

Y'know, I've never been caught by that one; it always surprises me how 
often people get tripped up by it.  Valid point all the same though.

>> So my thinking was to write a function that would return true/false 
>> depending on whether or not EOF was reached, and to modify its only 
>> argument with whatever was read from stdin.
> 
> This is a bad idea. Just use normal Python idioms.

Hmm, care to elaborate why it's a "bad idea" (not efficient, confusing 
to other programmers, etc)?  As it turns out it's also impossible, but 
that's a side issue. =8-p
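
To make the "impossible" part concrete, here is a minimal sketch of why 
the out-parameter idea cannot work, next to one conventional alternative.  
The names read_into and read_line are hypothetical, and a StringIO object 
stands in for stdin.

```python
import io

def read_into(line, stream):
    # Attempted "out parameter": this assignment only rebinds the local
    # name `line`; the caller's variable is left untouched.
    line = stream.readline()
    return bool(line)

def read_line(stream):
    """Return the next line, or None once EOF is reached."""
    line = stream.readline()
    return line if line else None

stream = io.StringIO("first\nsecond\n")

caller_line = ""
got_one = read_into(caller_line, stream)  # consumes "first\n"
# got_one is True, yet caller_line is still the empty string.

next_line = read_line(stream)             # the data comes back as the result
at_eof = read_line(stream)                # nothing left to read
```

Returning the value (with None marking EOF) gives the caller both pieces 
of information without needing a mutable argument.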

>> One thing I don't understand though, why are strings in Python 
>> immutable?  
> 
> For a few reasons:
> 1. they can only act as hash keys if they are immutable

Okay, that's a good one.

> 2. since Python does pass-by-object, changing a string in one part of
>    your program could unintentionally affect another part

Well, sure, referential transparency and no side effects are good things 
(tm), but there's a difference between letting the programmer decide 
which is better for his/her situation and having the language force a 
particular way of doing things upon him/her.  If you're going to take 
the functional programming approach and say "side effects are bad" then 
you should be making deep copies of all objects passed to functions.  
You could just as easily argue that an internal member of an object 
passed to a function may be accidentally modified inside the function 
(particularly since there's no notion of "private" data members in 
Python), thereby unintentionally affecting another unrelated part of the 
program.  Of course then you'd have some huge performance issues, which 
is why most languages don't do that.  Even a lot of functional languages 
still give you references (SML for example), and this is often necessary 
for performance reasons.
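
A small sketch of the scenario described above: a function modifies an 
internal member of the object it was passed, and a deep copy is what it 
takes to shield the caller.  The Config class and add_tag function are 
made up for illustration; copy.deepcopy is from the standard library.

```python
import copy

class Config(object):
    """Hypothetical object with a mutable internal member."""
    def __init__(self, tags):
        self.tags = tags

def add_tag(cfg, tag):
    # Mutates the object the caller passed in, since nothing is private.
    cfg.tags.append(tag)
    return cfg

# Passing the object directly: the caller's data changes as a side effect.
shared = Config(["a"])
add_tag(shared, "b")

# Passing a deep copy: the caller's data is protected, at the cost of
# copying the whole object graph on every call.
protected = Config(["a"])
result = add_tag(copy.deepcopy(protected), "b")
```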

>> That seems to be a fairly significant limitation from an efficiency 
>> standpoint as it means any kind of string processing involves creating 
>> new strings repeatedly instead of just modifying in place. 
> 
> Actually, efficiency is another reason why strings are immutable:
> 1. the hash of a string need only ever be calculated once (and cached)
> 2. identical strings can be (and are) pooled

Fair enough.  Now if identical strings are pooled, would that help with 
the following example:

while 1:
	# read a line in, string #1
	line = sys.stdin.readline()
	if not line:
		break
	
	# remove newline from end, string #2
	line = line.rstrip('\n')

	# Say capitalize the start of the string, string #3
	line = line.capitalize()

Or would you have to allocate memory & copy three strings each time 
through the loop?  I could see if a string is a substring of another 
(the case of removing the \n) then pooling would help (the 
implementation of the string class could just have a "start of string" 
and "end of string" pointer, so you'd just be moving pointers around), 
but in the case of changing part of the string (capitalizing) I would 
think it wouldn't help at all.

Now imagine if you, say, want to capitalize every word in the line; then 
the memory/time requirements would explode, would they not?  If there 
were n words in the sentence, then you'd have to create n copies of the 
string (or so it would seem to me -- correct me if I'm wrong).
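
For what it's worth, here is a sketch of one common way to capitalize 
every word without rebuilding the whole line once per word.  As far as I 
can tell it creates one small string per word plus a single joined 
result, so the total copying grows with the length of the line rather 
than with n copies of it.  The function name capitalize_words is made up 
for the example.

```python
def capitalize_words(line):
    # split() yields one short string per word; capitalize() creates one
    # new short string per word; join() builds the final line in one pass.
    words = line.split(' ')
    return ' '.join(word.capitalize() for word in words)

result = capitalize_words("the quick brown fox")
```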

> I think that you mentioned before that you are a Perl programmer? Time 
> to start getting used to immutable strings for Perl 6 :-)

Actually, I'm not (a Perl programmer).  I'm more familiar with Perl than 
with Python, but I am not a Perl guru by any stretch of the 
imagination. =8-p
--
Adam Parkin
E-mail: pzelnip at telus.net
----------------------
A common mistake people make when trying to design something completely 
foolproof is to underestimate the ingenuity of complete fools.

	--Douglas Adams, http://www.kettering.edu/~jhuggins/humor/quotes.html

