

INDEX

A

abs() function, 21, 55
.__abs__() method, 21, 431
Absolute file positioning, 156
Absolute path, 66
Absolute value, 21
Abstract state machine class, 273–274
.__add__() method, 13, 431
Adder factory, 4–5
Adler-32 checksum, See Checksum
AE module, 100
aepack module, 100
aetypes module, 100
AIFC audio files, 104
aifc module, 104
AIFF audio files, 104
AL module, 104
al module, 104
Alphanumeric character class (\w), 239
Alternation operator, SimpleParse (/), 324–325
Alternation operator, regex (|), 208, 240
Ambiguous arithmetic expression, 340
amkCrypto module, 165
and operator, 434
anydbm module, 90–92
.append() function, 32
AppleEvents, 100–101
AppleSingle format files, 100
applesingle module, 100
Applications, 102
apply() function, 447, 450–451
approx() function, 12
Archived files, 178–180
array module, 105, 130
Arrays, 105
Art of Computer Programming, The, Third Edition (Knuth), 20
ASCII (American Standard Code for Information Interchange), 465–466
    characters, 312
    converting data to binary, 158–163
    regexes for markup, 318
    transmitting binary data as, 121–123
ascii encoding, 186
Asymmetrical encryption, 164, 481
Asynchronous event handlers, 108
Asynchronous I/O on sockets, 397
Asynchronous socket service clients and servers, 395
asyncore module, 395
atexit module, 105
Atoms, 207, 209
Attributes
    mx.TextTools module, 308–309
    regular expressions, 249–255
    tokens, 335
Audio data, 104, 347
Audio hardware under Windows interface, 104
audiodata argument, 347
audioop module, 104
Authentication, 391
awk, 204
Aycock, John, 328

B

Backreferences, 210
    complex pattern, 218–219
    naming, 211
    operator, 238
    replacement, 214
    in replacement patterns, 218
Back-tick operator, 15
Base classes for datatypes, 13–34
base64 encoding, 158, 187
base64 module, 122, 158–159

mertz_final_index.fm Page 485 Monday, May 5, 2003 9:26 AM

base64 module, continued
    decode() function, 158
    decodestring() function, 158
    encode() function, 158
    encoded strings, 159
    encodestring() function, 158
    line blocks, 158–159
BaseHTTPServer module, 105
Basic Macintosh dialogs, 101
Basic string transformations, 128–147
Bastion module, 105
Beazley, David, 328
Beginning of line operator (^), 239
Beginning of string (\A), 239
Bemer, Bob, 465
Benchmarks, 87, 287–296
Berkeley DB library interface, 92
BigGraph module, 281
Big-O notation, 481–482
binary minus (-) operator, 22
Binary bit operations, 18
Binary data
    converting to ASCII, 158–163
    transmitting as ASCII, 121–123
Binary files, See Binary string
Binary string
    base64 encoding, 159
    binhex4 encoding, 160
    hexadecimal encoding, 159
    packed, 84–86
    quoted printable encoding, 160
    UUencoding, 160
binascii module, 122, 159–161
    a2b_base64() function, 158, 159
    a2b_hex() function, 159
    a2b_hqx() function, 159
    a2b.qp() function, 159
    a2b_qp() function, 162
    a2b_uu() function, 159, 163
    b2a_base64() function, 158, 159
    b2a_hex() function, 159
    b2a_hqx() function, 160
    b2a_qp() function, 160, 162
    b2a_uu() function, 160, 163
    crc32() function, 160
    crc_hqx() function, 160
    Error exception, 161
    hexlify() function, 160
    Incomplete exception, 161
    rlecode_hqx() function, 161
    rledecode_hqx() function, 161
    unhexlify() function, 161
Binding names to objects, 42
Bindings, 418–421
Binhex, 122
binhex4 checksum, See Checksum
binhex module, 122, 161–162
    binhex.binhex() function, 161
    Binhex-encoded string, 159
    binhex.hexbin() function, 161
    binhex4 RLE (run-length encoding), 161
Birthday paradox, 41, 482
bisect module, 105
Bit-position encoded character set, 310
Bit-shifting, 19
Bitwise inversion, 19
Bitwise-and, 19
Bitwise-or, 19
Bitwise-xor, 19
Block special device files, 70
Block-level state machine, 292
book2html.py file, 474–479
Boolean comparisons, 14
Boolean datatype, 421
Boolean shortcutting, 434
Boolean value, 26
Bounded numeric quantifier, regex, 242
Browsers, remote-control interfaces, 398
BSD DB library interface, 92
BSD sockets, low-level interface, 397
bsddb module, 92
BSD-style mailbox, 373
buffer() function, 55
buildtools module, 100
__builtin__() function, 89
Built-in functions, 55
Buyer states chart, 279
buyer_invoices.py file, 276–278
Buyer/order report, parsing, 287–289
buyer_report.py file, 288–289, 291
buyers tag table states, 289–290
Byte-codes, generating, 106
bzip (BZ2), 172

C

C extensions, 55
Caches
    checking for file modification, 65
    clearing, 64
    directory listings, 57–58
    lines from files, 64–65
    reading lines from, 64
cal utility, 100
Calendars, 100–101
Canonicalization, 413
Capabilities, testing, 10–11
Capability-based polymorphism, 10

"".capitalize() string method, 132–133
Carbon API interface, 101
Carbon.* modules, 101
cargo variable, 274
cd module, 101
"".center() string method, 133
cfmfile module, 101
CGI applications, debugging, 382–383
CGI (Common Gateway Interface), 376–383
    GET request, 376–377
    POST request, 376
cgi module, 376–383
CGI scripts, 376
    calling from another Web page, 377
    traceback, 382–383
    Web bugs, 378
cgi.FieldStorage class, 379–380
    .file attribute, 381
    .filename attribute, 381
    .getfirst() method, 380
    .getlist() method, 380
    .getvalue() method, 380–381
    .list attribute, 381
    .value attribute, 381
cgi.MiniFieldStorage, 381
CGIHTTPServer module, 105
cgitb module, 382–383
    .enable() method, 382–383
Character classes, 208, 238, 323
Character references, 387
Character set
    bit-position encoded, 310
    encodings, 395
Character special device files, 70
Characters, 465–466
    counting, 120–121
    lists split around, 311
    printable, 132
check_imap_subjects.py file, 366–367
Checking for server errors, 224–226
Checksum
    Adler-32, 182
    binhex4 checksum, 160
    CRC32 checksum, 160, 182
    CRC32 (cyclic redundancy check), 482
    CRC32 hash, 196
    SHA cryptographic hash, 170
Child classes comparisons, 14
chmod utility, 76
chroot utility, 75
chunk module, 104
CJK (Chinese-Japanese-Korean) alphabets, 185
class instances, 430–432
Classes
    definitions, 419–421
    inheritance, 11
    new-style, 11–14
    representing email messages, 355–362
    specializing, 11
Cleanup actions, 52
Cloning message objects, 350
.close() method of files, 16
closed attribute, 16
ClosureDict class, 36–37
cmath module, 105
cmd module, 105
cmp() function, 25, 113
.__cmp__() method, 29
Code Fragment Resource module, 101
code module, 105, 445
Code objects types, 55
codecs module, 186, 189–190
    EncodedFile() function, 190
    open() function, 189
codeop module, 106, 445
Codepages, 466
col utility, 223
Collections
    number of items in, 97
    types, 14
colorize utility module, 472–474
ColorPicker module, 101
COLORSCHEME state, 269
colorsys module, 104
Column statistics for delimited or flat-record files, 117–120
Combinatorial higher-order functions, 5–7
Command line
    parsing options, 44–47
    piped and redirected streams, 51
    summarizing documentation, 221–223
Commands
    mx.TextTools module, 299–300
    quick access to external, 73
commands module, 73, 74
    getoutput() function, 73
    getstatus() function, 73
    getstatusoutput() function, 73
Comments
    archived file, 180
    HTML, 387
    in verbose regular expressions, 220
    regular expression pseudo-group, 242
    XML documents, 401
commify() function, 230–231
Communications Tool Box interface, 101
Comparing
    custom classes, 14

    directories, 58–61
    files, 58–61
    floating-point numbers, 21
Comparison operators, 14
comp.compression FAQ, 464
compile() function, 55
compile module, 106
compileall module, 106
compile.ast module, 106
Compiled applications, 103
Compilers: Principles, Techniques, and Tools (Aho, et al.), 257–258
compile.visitor module, 106
comp.lang.python newsgroup, xiv, 194
complex datatype, 422
complex() function, 10
Complex numbers, 20, 22–24, 105
    .conjugate() method, 23
    .imag() method, 23
    .__ge__() method, 22
    .__gt__() method, 22
    .__le__() method, 22
    .__lt__() method, 22
    .real() method, 24
Complex pattern backreferences, 218–219
complex_file_operation() function, 443
Compound address, parsing, 365
Compound data, 90
Compound types, 430–432
compress (.Z), 172–173
Compressing. See Data compression
Compression object, 183–184
Concatenating strings to sha object, 172
Concrete state machine, 274–280
ConfigParser module, 282–283
Configuration files, 221–223
Constants
    Font Manager library (IRIX), 101
    FORMS library (IRIX), 101
    interpreting results of os.statvfs() and os.fstatvfs(), 108
    mx.TextTools module, 298–299
    regular expressions, 244–245
    Silicon Graphics' Graphics Library, 104
    trigonometric and algebraic, 107
Container types, 427–430
ContentHandler class, 406
ContentHandler handler, 405
CONTENT_LENGTH environment variable, 378
Content-Type header, 361
Continuation characters, 226–227
continuation_ws argument, 351
Cookie module, 395
Cookies, managing, 395
copy module, 43–44
    copy() function, 43
    deepcopy() function, 43–44
Copying, 42
    deep copy, 43–44
    dictionaries, 26
    directory trees, 68–69
    file-like objects, 68
    files, 68–69
    permission bits, 68
    permissions data, 69
    sha object, 171
    shallow copy, 43
    substring within memory-mapped file object, 150
    symbolic links, 69
    timestamp data, 69
    URLs (Uniform Resource Locators), 392
copy_reg module, 106
Core Python Programming (Chun), xv
"".count() string method, 134
Counting characters, words, lines and paragraphs, 120–121
cPickle module, 93–94, 106
    dump() function, 93
    dumps() function, 93–94
    load() function, 94
    loads() function, 94
crc32.py utility, 196, See also Checksum
Criteria, quickly sorting lines on custom, 112–115
crypt module, 166
    crypt.crypt() function, 166
Cryptography
    asymmetrical encryption, 164, 481
    cryptographic hash, 163–164, 166, 482
    digital signatures, 164
    MD5 (Message Digest 5), 167
    SHA (Secure Hash Algorithm), 170–172
    strong hash, 196–198
    symmetrical encryption, 163
    third-party modules, 165
    threat model, 196–198
cStringIO file-like object, 122
cStringIO module, 153–158
    InputType constant, 154–155
    OutputType constant, 155
    StringIO class, 155
    StringIO() function, 459
    StringIO.close() method, 155
    StringIO.flush() method, 155
    StringIO.getvalue() method, 155
    StringIO.isatty() method, 155
    StringIO.read() method, 156

    StringIO.readline() method, 156
    StringIO.readlines() method, 156
    StringIO.reset() method, 156
    StringIO.seek() method, 156
    StringIO.tell() method, 157
    StringIO.truncate() method, 157
    StringIO.write() method, 157
    StringIO.writelines() method, 157–158
ctb module, 101
current_section variable, 270
curses package, 106
    curses.ascii module, 106
    curses.panel module, 106
    curses.textpad module, 106
    curses.wrapper module, 106
Custom
    comparison function, 113
    datatypes, 11–13
    datatypes and magic methods, 13–34
    dictionary-like objects, 36–37
    file-like objects, 15–17
    FTP (file transfer protocol) clients, 395
    functions, 3–4
    IMAP clients, 366–368
    POP3 clients, 368–370
    processing, x
    SAX handlers, 406
    SMTP clients, 370–371
    sorting algorithm, 113–115
    Telnet clients, 397
    text compressor, 459–464
    text processing tasks, x
    Web clients, 396
Customizable startup module, 108
Customizing string representation of objects, 96–98
Cyclic garbage collection, 106
Cyphertext string, 170

D

Data, ix–x
    accidental damage to, 196–198
    as code, 445–446
    compound, 90
    deep, 258–260
    hash of correct, 196
    special values and formats, 82–89
    structured, 90
Data compression
    bzip (BZ2), 172
    choosing correct data representation, 458–459
    compress (.Z), 172–173
    custom text compressor, 459–464
    data set example, 454
    definition of, 453
    GZ (gzip), 172–173
    Huffman encoding, 456–457
    Lempel-Ziv compression, 457–458
    Lossless versus lossy, 454
    references, 464
    RLE (run-length encoding), 455–456
    SIT, 173
    whitespace compression, 455
    word-based Huffman compressed text, 460–461
    ZIP format, 172–173
    zlib library, 181–185
Data fork, 161
Data set example, 454
Data structures, 111
Datatypes, 8–9, 54–55, 421
    !=, <>, and == operators, 14
    base classes, 13–34
    Boolean comparisons, 14
    buffer() function, 55
    custom, 11–34
    emulating, 11
    equality, 95
    file-like objects, 11
    format codes, 424
    list-like, 28–32
    more readable, 94–96
    pretty-printing, 94–96
    printing, 425–427
    recursive containers, 95
    response to == operator, 14
    simple, 421–423
    string interpolation, 423–425
Date tuples, 86
Dates, manipulating values, 86–89
db file, 93
dbhash module, 92
dbm-style databases, 90–93
dbm module, 92
*DBM modules, 90–93
    DBM.close() method, 91
    DBM.first() method, 91
    DBM.has_key() method, 92
    DBM.keys() method, 92
    DBM.last() method, 92
    DBM.next() method, 91–92
    DBM.open() function, 91
    DBM.previous() method, 92
    DBM.sync() method, 92
Debugging
    stack traces, 109
    mx.TextTools tag table, 297–298

Decimal numerals, 130, 298
declaration patterns, SimpleParse, 321
Decoding base64 encoding, 158
Decompressing zlib library, 181–185
Decompression object, 183, 185
Decryption, Enigma-like, 168–169
Deep data, 258–260
Default Unicode string encoding, 52
del statement, 30
Delimited files, 117–120
Deprecation Warning, 235
DES (Data Encryption Standard), 166
Detecting duplicate words, 223–224
DEVICE module, 104
Devices, 70
Dict class, 24–27
dict() function, 10
dict type, 36
    dict.clear() method, 26
    dict.__cmp__() method, 24–25
    dict.__contains__() method, 25
    dict.copy() method, 26
    dict.__delitem__() method, 25
    dict.get() method, 26
    dict.__getitem__() method, 25
    dict.has_key() method, 26
    dict.items() method, 26–27
    dict.iteritems() method, 26–27
    dict.iterkeys() method, 27
    dict.itervalues() method, 27
    dict.keys() method, 27
    dict.__len__() method, 26
    dict.popitem() method, 27
    dict.setdefault() method, 27
    dict.__setitem__() method, 26
    dict.update() method, 27
    dict.values() method, 27
Dictionaries, 24–27
    dicts, 428–429
    hash collisions, 30
    mapping group names to group numbers, 249
    mapping symbolic names to character entities, 383–384
    named groups, 253
Dictionary objects, 24–27
Dictionary-based string interpolation, 35–36
Dictionary-like objects
    containing current environment, 81
    custom, 36–37
diff utility, 283
difflib module, 283–284
    get_close_matches() function, 283
    ndiff() function, 283
    restore() function, 283
Digit character class (\d), 208, 238
Digital signatures, 164, 195–196, 482
dir() function, 55
dircache module, 57–58, 106
    annotate() function, 58
    listdir() function, 58
    opendir() function, 58
Directories, 70
    caching listings, 57–58
    comparing, 58–61
    comparison report, 59–60
    filenames, 60
    identifying, 58
    information about, 76, 79–80
    listing, 76
    numeric mode, 75–77
    owner and group, 75
    path permissions, 74–75
    pathnames, 60, 66
    reading listings, 57–58
    removing, 79
    renaming, 79
    subdirectories, 60
Directory trees, 68–69
dis module, 106
dissertation.dtd file, 411–412
dissertation.py file, 412
Distributing Python Modules (Ward), 106
distutils module, 106
Division, 21
divmod() function, 21
dl module, 101
__doc__ strings, 106
doctest module, 106
Document collections, 199–202
Documentation
    script and module for examining, 108
    summarizing command-line option, 221–223
Documents, finding relevant in collection, 199–200
DocUtils package, 471
DOM (Document Object Model), 403–404
    4DOM, 413
    implementation, 413
    implementation that conserves memory, 405
    lightweight implementation, 405
    OOP model for working with XML, 410
DTDHandler class, 406
DTDHandler handler, 405
DTDs (Document Type Definitions), 261, 401–402
dumbdbm module, 92
Duplicate words, detecting, 223–224
dupwords.py file, 223
Dynamic Web pages, 34

E

EasyDialogs module, 101
EBCDIC, 465
EBNF (Extended Backus-Naur Form) grammars, 258, 261, 262–264, 317
EBNF parser library, 286
EBNF parsing, high-level, 316–328
EBNF-style description of Python floating point, 261
eGenix.com Web site, xvi
Elements, DOM, 400
Ellipsis object, 55
Email
    adding string or Unicode string to end, 352–353
    BSD mailbox, 361
    communicating with network servers, 344
    constructing header object, 354
    core text processing task, 345
    describing RFC-2231 string components, 353–354
    examining contents of message folders, 344–345
    formatted address, 364
    frauds, 345
    list of compound addresses, 364
    managing headers with non-ASCII values, 351–354
    manipulating and creating message texts, 345–348
    multinational strings in header, 351–354
    RFC-2822-formatted date, 364
    spam, 345
    timestamp, 365
    viruses, 345
Email addresses, 228–229
Email clients, storing messages, 344–345
Email messages
    adding field to header, 357
    adding payload, 357–358
    base64 encoding, 349
    body, 345–346
    BSD mailbox envelope header, 362
    classes representing, 355–362
    content type, 362
    content typing rules, 356
    Content-Type header, 361
    current default type, 362
    default type, 359
    encoding body, 349
    encoding parameter to RFC-2231, 362
    header, 345–346
    header fields, 358–359
    helper functions, 364–365
    indexing by key, 356
    iterating through components, 354
    MIME content delimiters, 358, 361
    MIME message boundary delimiter, 359
    mimification and unmimification, 396
    multipart, 360–361
    payload, 360–362
    pretty-printed representation of structure, 355
    quoted printable encoding, 349
    recursively traversing, 362
    removing parameter from header, 358
    serializing to RFC-2822-compliant text string, 357
    string description, 359
    uniform interface, 372–374
email package, 282, 345–349
    email.message_from_file() function, 348
    email.message_from_string() function, 348
email.Charset module, 351, 395
email.Encoder module, 349
    encode_base64() function, 349
    encode_7or8bit() function, 349
    encode_quopri() function, 349
email.Errors module, 349
email.Generator module, 350
    DecodedGenerator class, 350
    DecodedGenerator.clone() method, 350
    DecodedGenerator.flatten() method, 351
    DecodedGenerator.write() method, 351
    Generator class, 350
    Generator.clone() method, 350
    Generator.flatten() method, 351
    Generator.write() method, 351
email.Header module, 351–354
    decode_header() function, 353–354
    Header class, 351–352
    Header.append() method, 352–353
    Header.encode() method, 353
    Header.__str__() method, 353
    make_header() function, 354
email.Iterators module, 354–355
    body_line_iterator() function, 354–355
    _structure() function, 355
    typed_subpart_iterator() function, 355
email.Message module, 355–362
    Message class, 346, 355–357
    Message object, 389
    Message.add_header() method, 357
    Message.as_string() method, 357
    Message.attach() method, 357–358
    Message.del_param() method, 358
    Message.epilogue attribute, 358
    Message.get_all() method, 358–359

    Message.get_boundary() method, 359
    Message.get_charsets() method, 359
    Message.get_content_charset() method, 359
    Message.get_content_maintype() method, 359
    Message.get_content_subtype() method, 359
    Message.get_content_type() method, 359
    Message.get_default_type() method, 359
    Message.get_filename() method, 359
    Message.get_param() method, 360
    Message.get_params() method, 360
    Message.get_payload() method, 360–361
    Message.get_unixfrom() method, 361
    Message.is_multipart() method, 361
    Message.preamble attribute, 361
    Message.replace_header() method, 361
    Message.set_boundary() method, 361
    Message.set_default_type() method, 362
    Message.set_param() method, 362
    Message.set_payload() method, 362
    Message.set_type() method, 362
    Message.set_unixfrom() method, 362
    Message.walk() method, 362
email.MIMEAudio.MIMEAudio, 347–348
email.MIMEBase.MIMEBase class, 346
email.MIMEImage.MIMEImage class, 348
email.MIMEMultipart.MIMEMultipart class, 347
email.MIMENonMultipart.MIMENonMultipart class, 347
email.MIMEText.MIMEText class, 348
email.Parser module, 363
    HeaderParser class, 363
    HeaderParser.parse() method, 363
    HeaderParser.parsestr() method, 363
    Parser class, 363
    Parser.parse() method, 363
    Parser.parsestr() method, 363
email.Utils module, 364–365
    decode_rfc2231() function, 364
    encode_rfc2231() function, 364
    formataddr() function, 364
    getaddresses() function, 364
    make_msgid() function, 365
    mktime_tz() function, 365
    parseaddr() function, 365
    parsedate() function, 365
    parsedate_tz() function, 365
    quote() function, 365
    unquote() function, 365
Empty characters, 131–132
Empty productions, 338
"".encode() string method, 186–188
encode_binary.py file, 122–123
Encryption, See also Cryptography
    algorithms, 169
    asymmetrical, 164
    Enigma-like, 168–169
    symmetrical, 163
End of line ($), 239
End of string (\Z), 239
EndLoop exception, 442
end_state flag, 281
"".endswith() function, 134
English letters, 298
    lowercase letters, 298
    numbers and letters, 298
    uppercase letters, 298
Enhanced objects, 11–13
Enigma-like encryption and decryption, 168–169
EntityResolver class, 406
EntityResolver handler, 405
enumerate() function, 447, 449
Environment variables, 75, 78
EOF command, 296
errno module, 106
errno system symbols, 106
Error messages interpreter, 50–51
Error on failure (!), 324
Error recovery, 340
ErrorHandler class, 406
ErrorHandler handler, 405
error_page.py file, 225
Escape (\) operator, regular expressions, 236
Escape-style shortcuts, 208
/etc/inetd.conf file, 221–223
eval() function, 445
Evans, Carey, 165
Event scheduler, 108
Event-based API, 404
Exact numeric quantifier (), 241
except statements, 44, 421, 443
Exception classes, 44
Exceptions, 44, 49
    built-in, 89
    catching, 441–444
    dynamic scope, 441–442
    email package, 349
    exiting gracefully from deeply nested loops, 442
    flagging circumstances as unusual, 441
    invalid or disallowed actions, 441
    raising, 441–444
exceptions module, 44, 443
Excessive call nesting, 4
exec statement, 445–446
Execution, restricted facilities, 108
Existential quantifier (+), regular expressions, 240
Existential quantifier (+), SimpleParse, 323

Exit handlers, 105
Exiting Python, 52
"".expandtabs() string method, 134–135
expat nonvalidating XML parser interface, 405
exponentfloat alternative, 262
Exponentiation, 22
Expressions, 447–448
Extended call syntax, 450–451
Extended regular expressions, 209–210
External commands
    opening pipe to or from, 77
    opening STDIN, STDOUT, and STDERR pipes, 78
    opening STDIN and STDOUT pipes, 77
    quick access to, 73
extract_email() function, 229
Extracting content from fillers, 194–195
extract_urls() function, 229

F

Fallback conditions, 295
"Fast and Flexible Word Searching on Compressed Text," 464
Fast text manipulation tools, 286–316
FastCGI, 376
fcntl module, 101
fcntl() system function, 101
fcrypt module, 165
Fermat triples, 442
Field names, email headers, 346
fields dictionary, 35
fields_stats.py file, 117–120
FieldStats class, 120
FIFO (named pipe), 70
file class, 15–17
File descriptors, 74
File extensions, 374–376
File Find, x
File objects, 37, 74
    closing, 16
    invisible, 80
    memory-mapped, 147–153
    STDERR (standard error stream), 50–51
    STDIN (standard input stream), 51
    STDOUT (standard output stream), 51
    temporary, 71
    update mode, 80
File system services, 102
filecmp module, 58–61
    cmp() function, 58
    cmpfiles() function, 59
    dircmp class, 59
    dircmp.common attribute, 60
    dircmp.common_dirs attribute, 60
    dircmp.common_files attribute, 60
    dircmp.common_funny attribute, 60
    dircmp.diff_files attribute, 60
    dircmp.funny_files attribute, 60
    dircmp.left_list attribute, 60
    dircmp.left_only attribute, 60
    dircmp.report() method, 59–60
    dircmp.report_partial_closure() method, 60
    dircmp.right_list attribute, 60
    dircmp.right_only attribute, 60
    dircmp.same_files attribute, 60
    dircmp.subdirs attribute, 61
fileinput module, 61–63
    close() function, 62
    FileInput class, 63
    filelineno() function, 63
    filename() function, 63
    input() function, 62
    isfirstline() function, 63
    isstdin() function, 63
    lineno() function, 63
    nextfile() function, 63
File-like interface, 71
File-like objects, 9–11, 11, 68
    connecting to URL (Uniform Resource Locator), 389
    copying, 68
    custom, 15–17
    generator instance writing to, 350
    message text contained in, 348
    reading and writing strings, 15
    reading and writing to string buffer, 153–158
    serialized objects, 94
    writing serialized form of object, 93
    writing string to, 351
Filenames
    matching patterns against, 232–234
    temporary, 71
Files
    appending, 16
    cache of os.stat() information, 108
    caching lines from, 64–65
    closing, 16, 63
    comparing, 58–61
    copying, 68–69
    file descriptor number, 16
    file handle, 381
    .fileno() method, 16, 148, 389
    finding random lines, 39–41
    group, 75
    identifying, 58
    information about, 74, 76, 79–80
    last access time, 70

    last status change, 71
    line-oriented, 39–41
    lines from large, 37–38
    listing, 76
    mode, 16
    modification time, 71
    name of, 16, 63, 381
    number of lines read, 63
    numeric mode, 75
    opening next, 63
    owner, 70, 75
    path permissions, 74–75
    persistence, 147
    position, 17
    positioning, 156
    reading, 15–16
        backwards, 126–128
        lines from, 2, 38–39
        multiple, 61–63
        over lines, 72
    .readline(), 120
    .readline() method, 126
    .readlines() method, 120, 126
    removing, 78, 81
    renaming, 79
    run-length encoding, 161
    setting access and modification timestamps, 81
    shallow comparison, 58
    simulating random access, 64–65
    size of, 70
    as strings, 147–158
    strings delimiting lines, 82
    temporary, 71
    testing, 69–71
    truncating, 17
    TTY-like device, 16
    UUdecode, 163
    UUencode, 163
    writing to, 16
    .xreadlines() method, 126
Filesystems, 65–68
filter() function, 4–6, 435–438, 447, 450
Filters, 3–7
finally clauses, 52
"".find() string method, 135
findertools module, 101
findfile1.py file, 199–200
findfile2.py file, 200–201
Finding
    first match, 312
    random lines in files, 39–41
find_urls.py file, 228–229
First-order functions, 4
first_things() function, 125
FL module, 101
fl module, 101
Flags, 149
Flat-record files column statistics, 117–120
float class, 21–22
float() function, 10
float datatype, 19, 422
    float.__abs__() method, 21
    float.__add__() method, 21
    float.__cmp__() method, 21
    float.__div__() method, 21
    float.__divmod__() method, 21
    float.__floordiv__() method, 21
    float.__mod__() method, 21–22
    float.__mul__() method, 22
    float.__neg__() method, 22
    float.__pow__() method, 22
    float.__radd__() method, 21
    float.__rdiv__() method, 21
    float.__rdivmod__() method, 21
    float.__rfloordiv__() method, 21
    float.__rmod__() method, 21–22
    float.__rmul__() method, 22
    float.__rpow__() method, 22
    float.__rsub__() method, 22
    float.__rtruediv__() method, 22
    float.__sub__() method, 22
    float.__truediv__() method, 22
Floating point numbers, 19
    with beta distribution, 82
    circular uniform distribution, 83
    comparing, 21
    converting string to, 132
    defining, 262–263
    division, 21–22
    exception control (Unix), 102
    exponential distribution, 83
    exponentiation, 22
    floor division operator //, 21
    formatting functions, 106
    gamma distribution, 83
    Gaussian distribution, 83
    log normal distribution, 83
    math, 20
    modulo division, 22
    multiplication, 22
    negative, 22
    Pareto distribution, 83
    random, 84
    ratio, 21
    summing, 21
    von Mises distribution, 84

Page 11: mertz final index€¦ · mertz_final_index.fm Page 489 Monday, May 5, 2003 9:26 AM. 490 I NDEX Decimal numerals, 130, 298 declaration patterns, SimpleParse, 321 Decoding base64 encoding,

I

NDEX

495

Weibull distribution, 84Floor division operator (//), 21Flow control, 432

Boolean shortcutting, 434filter() function, 435–438for/continue/break statements, 434–435functions, 439–441if/then/else statements, 433–434list comprehensions, 435–438List-application functions as, 450map() function, 435–438reduce() function, 435–438simple generators, 439–441while/else/continue/break statements,

438–439yield statement, 439–441

flp module, 101Flush left text block, 220–221

flush_left() function, 221flush() method, 16fnmatch module, 64, 232–234

filter() function, 233–234fnmatch() function, 233fnmatchcase() function, 233

Font Manager library (IRIX), 101for statements, 26, 420for/continue/break statements, 434–435Format codes datatypes, 424Format string, 84formatter module, 284–285

AbstractWriter class, 115DumbWriter class, 115

Formatting events, 284Formfeed character, 299form_letter template, 35Forms

automating processing, 379–380filling out, 34–37

FORMS library (IRIX), 101Fourthought company Web site, 408Functional programming (FP), 271, 446–447

concepts, 1–2expressions, 447–448extended call syntax, 450–451functions, 447lamda operator, 447–448list-application functions as flow control, 450obfuscated Python code, 271rebinding names, 447side effects, 447solutions expressed in terms of what, 447special list functions, 448–449

fpectl module, 102fpformat module, 106

frame objects types, 56FrameWork module, 102Freshmeat Web site, xvFTP (File Transfer Protocol) clients, 395ftplib module, 343, 395funcs tag table, 296Function factories, 4–5Function objects, 4Function-defined patterns, 336Functions, 439–441

ad hoc overloading, 53
built-in, 89
built-in Unicode, 186–188
custom, 3–4
definitions, 419–421
as first-class objects, 447
first-order, 4
higher-order, 1–7
lambda operator, 419
mx.TextTools modules, 310–311
naive argument overloading, 53–54
"quacks like a duck" overloading of argument, 54
referentially transparent, 447
regular expressions, 245–248
signature-based, 8
special list, 448–449
standard operations as, 47–48
string module, 129
trigonometric and algebraic, 107
with two parameters as argument, 437
type checks on arguments, 8

Fuzzy concepts, 12–13, 15
Fuzzy tagstack, 386

G

Garbage collection, 109
gc module, 106
gdbm module, 93
generator function, 439
generator iterator, 439
generator_iter object, 440
Generator-iterator objects, 56
German letters, 298
.__getinitargs__() method and the pickle module, 93
.__getitem__() object method, 63, 431
getopt module, 44–47

getopt.getopt() function, 46–47
getopt.GetoptError exception, 46

getpass module, 106
.__getstate__() method and the pickle module, 93
gettext module, 102


GL module, 104gl module, 104glob module, 64

glob.glob() function, 64Glob-style pattern, 64Glob-style subpatterns file, 233

Gnosis Utilities, xvi, 408

gnosis.indexer, xvi
gnosis.xml.indexer module, xvi, 409
gnosis.xml.objectify module, 409–410
gnosis.xml.pickle module, xvi, 94, 410–411
gnosis.xml.validity module, 411–412

Gnosis Web site, xiv, 447GNU readline interface, 108Google, xvi, 391Googol, 263Googolplex, 263Gopher protocol client interface, 395gopherlib module, 343, 395Grammar rules, 337–339Grammars, 260–265Greenwich Mean Time, 88grep, x, 204, 207, 213Grouping operators

regular expressions, 237–238
SimpleParse, 326

Grouping regular expressions, 207Group-like patterns, 242–244grp module, 102Guttman-Rosler Transform, 113GZ (gzip), 172, 173gzip file-like object, 174gzip module, 173–175

gzip.close() method, 174gzip.flush() method, 174gzip.GzipFile class, 174, 459gzip.isatty() method, 174gzip.myfileobj attribute, 174gzip.open class, 174gzip.read() method, 174gzip.readline() method, 175gzip.readlines() method, 175gzip.write() method, 175gzip.writelines() method, 175

gzipped files, 173–175gzip object, 174–175

H

Handlers

asynchronous events, 108
states, 274, 279

hash() function, 30, 99
Hashes, see Checksum
hash_rotor.py file, 197–198
HeaderParserError exception, 361
Headers, email, 351–354
Helper functions, 364–365
hex() function, 19
Hex string, 19
Hexadecimal numerals, 130
Hexadecimal-encoded string, 159
Higher-order functions, combinatorial, 5–7
High-level EBNF parsing, 316–328
High-level programmatic parsing, 328–341
histogram.py file, 124–125
Histograms, 123–126
HLS color space, 104
HOFs (higher-order functions), 1–7
HSV color space, 104
HTML, 344

character entity references, 383–384comments, 387content data, 387declarations, 387endtag, 387entity reference, 387last tag encountered, 388messages, 343parsers, 384–388PI (processing instruction), 388restoring instance to initial state, 388sending additional data to parser, 387templating system for delivery, 398text, 387

HTML documents, 383–388

event-based framework for processing, 384–388
parsing, 285
processing, 285
rating error probability, 225
Unicode, 469
whitespace compression, 455

htmlentitydefs module, 383–384htmllib module, 284, 285, 384HTMLParser module, 282, 384–388HTMLParser.HTMLParser class, 385–386

.close() method, 386

.feed() method, 387

.getpos() method, 387.handle_charref() method, 387

.handle_comment() method, 387

.handle_data() method, 387

.handle_decl() method, 387

.handle_endtag() method, 387

.handle_entityref() method, 387–388

.handle_pi() method, 388

.handle_startendtag() method, 388


.handle_starttag() method, 388

.lasttag attribute, 388

.reset() method, 388HTMLParser_stack.py file, 385HTTP, 105, 121, 343–344httplib module, 343, 396HTTP_REFERER environment variable, 378HTTP_USER_AGENT environment variable, 378Huffman encoding, 456–457hypontenuse() function, 448hypothetical.ini file, 269

I

ic module, 396icopen module, 396Idempotent functions, 483IFF audio data, 104if/then/else statements, 433–434IF/THEN/END grammar, 265–267IF/THEN/END structures, 263–264ignore Unicode encoding, 188ignore token, 336Illegal characters, 336imageop module, 104IMAP4, 366–368IMAP clients, custom, 366–368IMAP instance object, 367IMAP server, 367imaplib module, 343–345, 366–368, 370imaplib.IMAP4 class, 367

.close() method, 367

.expunge() method, 367

.fetch() method, 367

.list() method, 367

.login() method, 367

.logout() method, 367

.search() method, 368

.select() method, 368imgfile module, 104imghdr module, 396imglib files, 104Immutable, 427, 483imp module, 107, 445__import__() function, 446import statements, 107, 420Importing packages and modules, 420in operator, 25, 30, 33–34"".index()string method, 135–136Indexed assignment, 31indexer.py utility, 201

*Indexer.fileids mapping, 201*Indexer.files mapping, 201*Indexer.words mapping, 201

IndexError exception, 30Info-Zip, 176Inheritance, 11.ini file, 269–271.__init__() method, 431I-node number, 70I-node protection mode, 70Input, redirected, 61–63input() function, 446Input sequence, 62Input string, parsing, 339inspect module, 107

Installing Python Modules (Ward), 106
INSTALL.LOG file, 2–3
int datatype, 12–13, 421–422

int.__and__() method, 18–19int.__hex__() method, 19int.__invert__() method, 19int.__lshift__() method, 19int.__oct__() method, 19int.__or__() method, 19int.__rand__() method, 18–19int.__rlshift__() method, 19int.__rxor__() method, 19int.__ror__() method, 19int.__rrshift__() method, 19int.__rshift__() method, 19int.__xor__() method, 19

int() function, 10, 11int objects, 18–19Integers, 18–19

bitwise operations, 18as fuzzy concepts, 12–13types, 50values, 132

Interactive shell prompts, 49Interfaces

audio hardware under Windows, 104Berkeley DB library, 92BSD DB library, 92Carbon API, 101Communications Tool Box, 101dbm-style databases, 90–93expat nonvalidating XML parser, 405GDBM (GNU DBM) library, 93GNU readline, 108MH mailboxes, 396Navigation Services, 103parser, 285Python DBM, 92Speech Manager, 102standard color selection dialog, 101Sun audio hardware, 105tokenizer, 285


Unix (n)dbm library, 92
Unix syslog library, 103
urllib objects, 388
WorldScript-Aware Styled Text Engine, 104

Intermediate Cryptology: Specialized Protocols Web site, 164
Internet

access configuration, 396
accessing resources, 388–394
Config replacement for open(), 396
modules, 394–399
text formats, 344

Internet protocols, 343
Interpreted and/or scripting language, 418
Interpreter

cleanup, 49
copyright information, 49
emulating, 105
error messages, 50–51
five components of version number, 52
information about, 49–53
string identifying operating system, 82
version information, 51–52
version number, 50
warnings, 50–51

Introduction to Cryptology Concepts I Web site, 164
Introduction to Cryptology Concepts II Web site, 164

Introspection, 328
IntType datatype, 18–19
Invalid regular expressions, 255
I/O

completion, 397
low-level, 74

ioctl() system function, 101
is not operator, 14
"".isalnum() string method, 136
"".isalpha() string method, 136
isatty() file method, 16
isCond() function, 2
"".isdigit() string method, 136
"".islower() string method, 136
.is_multipart() method of email.Message, 356
ISO-8859-1 character set, 383
iso-8859-1 encoding, 187
ISO-8859* encodings, 466
isRegDBVal() function, 4
isShortRegVal() function, 4
"".isspace() string method, 136
.issubset() method of sets, 429
.issuperset() method of sets, 429
"".istitle() string method, 136
"".isupper() string method, 136
item tag table, 290
items() function, 91
.items() dictionary method, 91, 355, 356
Iterator wrapper, 336–337

J

"".join() string method, 120–121, 130, 137
JPEG files, 104
jpeg module, 104
jump_count callback function, 297
jump_no_match condition, 299–300

K

Kasner, Edward, 263
Key/value pairs, storing, 90
KeyError exception, 26, 27, 178, 361
.keys() dictionary method, 355, 356, 380
keyword module, 107
keyword yield, 128
Keywords, 263
Knuth, Donald, 20, 232
Kuchling, Andrew, 165

L

lambda operator, 113, 419, 447–448
latin-1 encoding, 187
Leadout eater, 296
Learning Python (Lutz & Ascher), xv
Lemburg, Marc-Andre, 165, 286
Lempel-Ziv compression, 457–458
len() function, 14, 26, 31, 55, 355
LEX, 335–337
Lex & Yacc, 258
lex module, 336
Lexer, 261, 336–337
Lexical analyzer class, 286
LexToken, 329, 336
lex.token() function, 336
Line break characters, 299
Line matching, 283
line variable, 2
linecache module, 38–39, 64–65

checkcache() function, 65clearcache() function, 64getline() function, 38

Line-ending combinations, 315.lineno attribute, PLY, 335Line-oriented command interpreters, 105Line-oriented files, 39–41Lines

counting, 120–121


listing, 315number in platform-portable way, 311reading files backwards by, 126–128

list datatype, 28–32list.__add__() method, 29–30list.append() method, 32list.__contains__() method, 30list.count() method, 32list.__delitem__() method, 30list.__delslice__() method, 30list.extend() method, 32list.__getitem__() method, 30list.__getslice__() method, 30list.__hash__() method, 30–31list.__iadd__() method, 29–30list.__imul__() method, 31list.index() method, 32list.__len__() method, 31list.__mul__() method, 31list.pop() method, 32list.remove() method, 32list.__rmul__() method, 31list.__setitem__() method, 31list.__setslice__() method, 31list.sort() method, 32

List comprehensions, 435–438list() function, 10, 11, 428List-application functions as flow control, 450list_capwords.py file, 137List-like datatypes, 28–32Lists, 28, 427–428

adding elements, 29–30, 32
appending, 32
assignment to slice, 31
built-in, 130
collection of substrings, 130
containing value, 30
counting number of occurrences in, 32
decreasing size, 32
extending, 32
hash values, 30–31
indexed assignment, 31
indexing, 30
length of sequence, 31
new sequence object, 31
offset index, 32
removing item, 30
removing last item, 32
reversing, 32
satisfying condition given by function argument, 436–437
slice parameter, 30
sorting, 32, 112–113
split around character, 311
transformed items, 435–436
writing to string buffer, 157–158

Literal strings, 323, 423"".ljust() string method, 138Local time, 88locale module, 102LOCALHOST addresses, 229Locating patterns, xlocation_parse.py file, 393Log files, 2–3logical_lines.py file, 227long datatype, 422long() function, 10Long integer digits to stringify, 98long objects, 18–19LongType datatype, 18–19Lookahead assertions, regular expressions, 219Lookahead quantifier, SimpleParse, 324Lookbehind assertions, 219–220Lossless data compression, 454Lossy data compression, 454"".lower() string method, 138Lower-bound quantifier, 241Lowercase letters, 131Low-level I/O, 74Low-level state machine parsing, 286–316"".lstrip() string method, 139

M

mac module, 74, 102macerrors module, 102macfs module, 102macfsn module, 102MacOS

data fork, 161implementation of functionality, 102resource fork, 161structured development of applications, 102

MacOS module, 102MacOS Python interpreter, 102macostools module, 102macpath module, 102macresource module, 102macspeech module, 102mactty module, 102Magic methods, 11–34, 431Mail servers, communicating with, 366–372mailbox class, 372mailbox module, 282, 345, 372–374

mailbox.BabyLMailbox class, 373mailbox.Maildir class, 374mailbox.MHMailbox class, 373mailbox.MmdfMailbox class, 373


mailbox.PortableUnixMailbox class, 373mailbox.UnixMailbox class, 373

Mailboxes, 372–374mailcap file, 396man utility, 223map() function, 4–6, 447, 450Mapping object, 381MAP_PRIVATE flags, 149MAP_SHARED flags, 149Marked-up text, 317Marking up smart ASCII, 292–296markupbuilder.py file, 333–334marshal module, 94, 147Martelli, Alex, 20Matches, finding, 312MatchObject, 248–250, 253math module, 107maxlinelen argument, 351M2Crypto module, 165MD5 cryptographic hash, 167–169MD5 message digests, 167–169md5 module, 167–169md5 object, 167–169

md5.copy() method, 167–168md5.digest() method, 168md5.hexdigest() method, 168md5.md5 class, 167md5.MD5Type constant, 167md5.net class, 167md5.update() method, 168–169

Memory-mapped file objects

changing current file position, 152
closing, 149
copying substring within, 150
creation of, 148–149
current file position, 152
index position of first substring, 149–150
resizing, 151
returning string from, 151
underlying file size, 152
writing into, 153

Mersenne Twister generator, 82Message objects

audio data, 347based on message text, 348cloning, 350dictionary-like behavior, 355–356generator object iterating through, 354–355holding string or Unicode string, 351–352image data, 348MIME content type, 359multipart, 347parsing text message into, 363

prebuilt header, 346serializing, 350–351single part, 348, 349text data, 348

Message payload, 360Message-ID header, 365Methods

built-in Unicode, 186–188documenting base class, 13mx.TextTool module, 308–309regular expressions, 249–255shelve databases, 98string object, 129user-defined classes types, 56–57

MH mailboxes interface, 396mhlib module, 396MIME datatypes, 374–376MIME writer, 396MIME-reading or MIME-writing programs, 396mimetools module, 345, 389, 396mimetools.Message object, 389mimetypes modules, 374–376

.common_types attribute, 375

.encodings_map attribute, 375guess_extension() function, 374–375guess_type() function, 374init() function, 375.inited attribute, 375.knownfiles attribute, 376read_mime_types() function, 375.suffix_map attribute, 376.types_map attribute, 376

MimeWriter module, 345, 396mimify module, 345, 396MiniAEFrame module, 102mkcwproject module, 102mk_unicode_chart.py file, 469–470mmap module, 147–153mmap objects, 148, 150mmap.mmap class, 148–149

.close() method, 149

.find() method, 149–150

.flush() method, 150

.move() method, 150

.read() method, 150–151

.read_byte() method, 151

.readline() method, 151

.resize() method, 151

.seek() method, 152

.size() method, 152

.tell() method, 152

.write() method, 153

.write_byte() method, 153mode attribute, 16


mod_python, 376Modules

basic string transformations, 128–147building, 106controlling loading, 50email package, 345–348importance of, 41importing, 420installing, 106Internet, 394–399miscellaneous, 105–109multimedia formats, 104–105pathnames searched for, 50platform-specific operations, 100–104regular expressions, 231–255simple pattern matching, 232–234standard, 41–89standard Internet-related tools, 395–398standard library XML, 403–407third-party, 90third-party Internet-related tools, 398–399types, 57

Modulo division, 22
most_common() function, 126
Mount point, 66
Mozilla, 392
msvcrt module, 102
MTA (Mail Transport Agent), 345
MUA (Mail User Agent), 345
.__mul__() method, 431
multifile module, 285, 345
multifile.MultiFile class, 285
Multilingual applications, 102
Multimedia formats, 104–105
MultipartConversionError exception, 347
Multiple criteria, 3
Multiplication and floating-point numbers, 22
Multiproducer, multiconsumer queue, 108
Multithreaded applications, creation of, 108
Mutability, 28–29
Mutable, 483–484
Mutable objects, 42
Mutable strings, 130
mutex module, 107
Mutual exclusion locks, 107
mxCrypto module, 165
mx.Date module, 86
mx.DateTime, xvi
mx.TextTools module, xvi, 267

attributes, 308–309
benchmarks, 287–296
classes, 307–308
charsplit() function, 311
cmp() function, 310
collapse() function, 311
commands, 299–300
compound matches, 304–305
concrete parse tree of components of report, 288
constants, 298–299
countlines() function, 311
find() function, 312
findall() function, 312
functions, 310–311
hex2str() function, 312
invset() function, 310
isascii() function, 312
is_whitespace() function, 312
join() function, 313
lower() function, 313
matching particular characters, 301–302
matching sequences, 302–303
methods, 308–309
modifiers, 305–307
multireplace() function, 313–314
named jumped targets, 300
parser, 319
parser generator, 316–328
prefix() function, 313
replace() function, 314
set() function, 310
setfind() function, 314
setsplit() function, 314
setsplitx() function, 315
splitat() function, 315
splitlines() function, 315
splitwords() function, 315
str2hex() function, 315
suffix() function, 316
tag() function, 289, 290, 310–311, 322
tag table, 288
taglist, 292
unconditional commands, 300–301
upper() function, 316
utility functions, 311–316
version of typography() function, 292–295

mx.TextTools commands

AllIn, 301
AllInCharSet, 301
AllInSet, 301
AllNotIn, 301
Call, 305
CallArg, 305
EOF, 303
Fail, 300–301
Is, 302
IsIn, 302
IsInCharSet, 302


IsInSet, 302
IsNot, 302
IsNotIn, 302
Jump, 300–301
Move, 301
sFindWord, 303
Skip, 301
SubTable, 304
SubTableInList, 305
sWordEnd, 302–303
sWordStart, 302–303
Table, 304
TableInList, 305
Word, 302
WordEnd, 302–303
WordStart, 302–303

mx.TextTools constants

alpha, 298
alphanumeric, 298
alphanumeric_set, 298
alpha_set, 298
A2Z, 298
a2z, 298
A2Z_set, 298
a2z_set, 298
any, 299
any_set, 299
formfeed, 299
formfeed_set, 299
german_alpha, 298
german_alpha_set, 298
newline, 299
newline_set, 299
Umlaute, 298
Umlaute_set, 298
white, 299
white_set, 299
whitespace, 299
whitespace_set, 299

mx.TextTools modifiers

AppendMatch, 306
AppendTagobj, 307
AppendToTagobj, 306
CallTag, 305–306
LookAhead, 307

mx.TextTools.BMS class, 307–308

.find() method, 308
.findall() method, 308
.match attribute, 309
.search() method, 308
.translate attribute, 309

mx.TextTools.CharSet class

.contains() method, 309

.match() method, 309

.search() method, 309

.split() method, 309

.splitx() method, 309

.strip() method, 309

mx.TextTools.FS class, 307–308

.find() method, 308

.findall() method, 308

.match attribute, 309

.search() method, 308

.translate attribute, 309

mx.TextTools.TextSearch class, 307–308

.match attribute, 309

.search() method, 308

mxTypography.py utility module, 292, 295–297, 319

N

Nac module, 103Named group backreference (?P=name), 244Named group identifier (?P<name>), 244Named terms as parts of patterns, 262Names, assignment, 418–419Namespaces, 418–421

adding or modifying bindings, 420defining, 430–432

Navigation Services interface, 103ndiff utility, 283Negation operator, SimpleParse, 325Negative lookahead assertion (?!...), 243Negative lookbehind assertion (?<!...), 243Nested loops, exiting gracefully from, 442–443Nested subpatterns, 258Nesting

filter() function, 4–6filters, 3–7map() function, 4–6

netrc file, 396
netrc module, 396
Netscape OSA modules, 397
new module, 107, 445
new_email_subjects.py file, 368–369
News clients, storing messages, 344–345
Newsgroups, 344
New-style classes, 11–13
.next() method of iterators, 372, 439, 440
nis module, 103
NIS Yellow Pages, 103
NIST (National Institute of Standards and Technology), 170
NNTP (Network News Transport Protocol), 121

Client applications, 397

nntplib module, 344, 397


Node class, 332, 333Nodes, 260Non-alphanumeric character class (\W), 239Non-backreferenced atom (?:...), 242–243Non-digit character class (\D), 238Nonempty sequence, 83Non-greedy bounded quantifier, 242Non-greedy existential quantifier (+?), 241Non-greedy potentiality quantifier (??), 241Non-greedy quantifiers, 215Non-greedy universal quantifier (*?), 240Nonlooping tag table, 297Non-Windows systems, detaching applications, 80Non-word boundary (\B), 240nsremote module, 397Numbers, pretty printing, 229–231Numeric comparison operators, 21, 25Numeric error code, 80Numeric types, capabilities, 10Numeric values, encoding compactly, 84–86

O

object type, 14object.__eq__ method, 14object.__ne__ method, 14object.__len__ method, 14object.__nonzero__ method, 14object.__repr__ method, 15object.__str__ method, 15

Objects

binding names to, 42
binding trap, 42
built-in, 89
converting to strings, 90
copying, 42–44
creation in customizable ways, 107
customizing string representation, 96–98
datatypes, 54–55
deep copy, 43–44
enhanced, 11–13
equality, 14
file-like, 9–11, 15–17, 68
immutable, 427
inequality, 14
inspecting, 107
length of, 14
list-like, 28
magic methods, 11–13
mutable, 28–29, 42, 427–428
naming, 418–421
number of references to, 53
persistent, 41, 90
pickling behavior, 93–94
recursive containers, 95
references not limiting garbage collection, 109
restricted access, 105
serializing, 90–100
shallow copy, 43–44
snap shots, 42–43
standard type names, 98
storing, 90–100
tuple-like, 28
types, 53–57
writing serialized form of, 93

oct() function, 19Octal numerals, 130Octal string, 19Open file objects, 55open function, 15–16Operating systems

accessing features, 74–82identifying, 82string referring to current directory, 81

operator module, 47–48optik module, 45optparse module, 45or operator, 434os module, 74–82, 102

access() function, 74–75altsep attribute, 81chdir() function, 75chmod() function, 75chown() function, 75chroot() function, 75curdir, 81defpath, 81environ variable, 78, 81OSError error, 78, 79, 81fstat() function, 69–71getcwd() function, 75getenv() function, 75–76getpid() function, 76kill() function, 76linesep attribute, 82link() function, 76listdir() function, 57, 76lstat() function, 69–71, 76mkdir() function, 76–77mkdirs() function, 77mkfifo() function, 77name attribute, 82nice() function, 77pardir attribute, 82os.path module, 65–68, 74

abspath() function, 65basename() function, 65


commonprefix() function, 65
dirname() function, 65
exists() function, 65
expanduser() function, 65
expandvars() function, 66
getatime() function, 66
getmtime() function, 66
getsize() function, 66
isabs() function, 66
isdir() function, 66
isfile() function, 66
islink() function, 66
ismount() function, 66
join() function, 66
normcase() function, 67
normpath() function, 67
realpath() function, 67
samefile() function, 67
sameopenfile() function, 67
split() function, 67
splitdrive() function, 67
walk() function, 67–68

pathsep attribute, 82popen() function, 77popen2() function, 77–78popen3() function, 78popen4() function, 78putenv() function, 78readlink() function, 78remove() function, 78removedirs() function, 79rename() function, 79renames() function, 79rmdir() function, 79sep attribute, 82startfile() function, 79stat() function, 69–71, 79–80strerror() function, 80symlink() function, 80system() function, 80tempnam() function, 80tmpfile() function, 80uname() function, 81unlink() function, 81utime() function, 81

os2 module, 74Output file, decoding contents of argument, 161

P

p_*() function, 338Packages, 106, 420Packed binary strings, 84–86

p_add() rule, 340Paragraphs

counting, 120–121reading files backwards by, 126–128reformatting, 115–117

Parser libraries, 282–341parser module, 285Parser state machine, 340.parserbyname() method, SimpleParse, 322parser.out file, 340Parsers, 257, 261, 267

data becoming deep, 258–260
grammar, 260–263
HTML, 384–388
interfaces, 285
PLY module, 329
mx.TextTools, 286–316
SimpleParse, 316–328
specialized, 282–286
text becoming stateful, 258–260
tokens, 261
XHTML, 384–388
yacc module, 339

parsetab.py file, 340Parsing

buyer/order report, 287–289command-line options, 44–47compound address, 365data with regular expressions, 223HTML files, 285input string, 339low-level state machine, 286–316pencil-and-paper, 264–265text message into message object, 363token list, 332–334URLs (Uniform Resource Locators), 392–394Windows-style configuration files, 282–283XML (Extensible Markup Language), 407

PasswordsASCII 13-byte encrypted, 166collecting without echoing to screen, 106POP3 server, 369

patch utility, 283Path delimiters, 82PATH environment variable, 81Path symbolic link, 78Pathnames, 60, 64–68Paths, 65–68

controlling module loading, 50directory listings, 58permissions, 74–75

Pattern modifiers (?Limsux), 242PatternObject, 248–250Patterns


case-insensitive match, 233case-sensitive match, 233converting, 235function-defined, 336glob-style matching, 232–234listing elements matching, 233–234matching filenames against, 232–234matching string, 233regular expressions, 207–208simple matching, 232–234

pcre module, 231pdb module, 107Pencil-and-paper parsing, 264–265Permission bits, copying, 68Permissions

copying data, 69paths, 74–75

p_error() function, 340Persistent

files and strings, 147objects, 41

Peters, Tim, x, 20pickle module, 93–94, 106, 147

dump() function, 93dumps() function, 93–94load() function, 94loads() function, 94

pickle.Pickler class, 93Pipe character, 208Piped streams, 51Pipes, managing, 103pipes module, 103PixMap objects, 103PixMapWrapper module, 103PKZip, 176plain declaration, 334plain node, 334plain object, 334Plaintext string, 170plain_words tag table, 296Platforms

attributes, 80identifying, 50managing pipes, 103native byte order (endianness), 49

Platform-specific operations modules, 100–104PLY applications, token stream creation, 329–335PLY grammar, 334, 340PLY package

action code, 329allowable error conditions, 329error correction facility, 328error reporting, 328grammar rules, 337–338

LEX, 335–337lexer/tokenizer, 329lexing module, 335–337parser, 329parsing token list, 332–334productions, 337–338self-referential rules, 334speed, 328yacc module, 337–339

PLY parsers, xvi, 340–341Polymorphism, 8

capability-based, 10enhanced objects, 11–13identifying file-like objects, 9–11Pythonic, 9–11

POP3 clients, custom, 368–370POP3 protocol, 366, 369POP3 server, 369–370popen2 module, 107poplib module, 343, 344, 368–370

POP3 class, 369.apop() method, 369.dele() method, 369.pass_() method, 369.quit() method, 369.retr() method, 369.rset() method, 369.stat() method, 370.top() method, 370.user() method, 370

Positive lookahead assertion (?=...), 243Positive lookbehind assertion (?<=...), 243posix module, 74, 103POSIX tty control, 103posixfile module, 103Potentiality quantifier (?), 241, 324pprint module, 94–96

isrecursive() function, 95pformat() function, 96pprint() function, 96PrettyPrinter class, 96

pre module, 231, 234precedence variable, 340Predicative functions, 6–7Preferences manager for Python, 103pretty_nums.py file, 230–231Pretty-printing numbers, 229–231pretty-printing object, 96pricinglpy support data, 280Print calendars, 100print command, 352, 355print statement, 15, 51, 425–426Printable characters, 132


Printing

calendars, 100–101
datatypes, 425–427
directory comparison report, 59–60
formatted representation of object, 96
reports on code, 107

Processes

creation and management, 74
ids, 76
processor time, 87

profile module, 107Programmatic parsing, high-level, 328–341Programming languages, nesting, 259Programs

location of output, 49names in, 42

p_rulename() form, 332Pseudo terminal utilities, 103pseudo-random value generator, 82–84pstats module, 107pty module, 103Public-key encryption, 164, 484Punctuation, 131pwd module, 103.py files, 106, 108pyclbr module, 107py_compile module, 108pydoc module, 108py_resource module, 103Python

as byte-code compiled programming language, 418
container classes, 411–412
dynamically and strongly typed, 418
exiting, 52
grammar, 261–262
parser libraries, 282–341
virtual machine, 418

Python & XML (Jones & Drake), xv, 399
Python class browser, 107
Python Codec Registry, 189
Python Cookbook Web site, 112, 203
Python Cryptography modules, 165
Python DBM interface, 92
Python debugger, 107
Python Essential Reference, Second Edition (Beazley), xv
Python Imaging Library, 104
Python introspection, 328
Python Lex-Yacc, 328–341
Python Library Reference, xiii, 49, 74, 98, 366
Python newsgroup, xiv
Python objects, 410–411, 413
Python scripts, 49
Python Standard Library (Lundh), xv
Python Tutorial (Rossum), 417
Python Tutorial Web site, 1
Python Web site, xiv, 417
Python XML-SIG, 408
Pythonic polymorphism, 9–11
pythonprefs module, 103
PYX module, 414

document format, 414
home page, 408

PyXML package, 408, 413

Q

Quantifiers, 240–242

non-greedy, 215
PLY grammar, 334
SimpleParse module, 323–324

QUERY_STRING environment variable, 377, 378
Queue module, 108
Quick Python Book, The (Harms & McDonald), xv
Quickly sorting lines on custom criteria, 112–115
QuickTime movies, 105
quietconsole module, 103
Quixote module, 398
Quixote Web site, 398
quopri encoding, 187
quopri module, 122, 162

decode() function, 162decodestring() function, 162encode() function, 162encodestring() function, 162

Quoted printable encoding, 162Quoted Printable string, 159

R

rand() function, 82randline module, 40Random element, 83Random floating point value, 84Random generator, 84random module, 82–84

betavariate() function, 82choice() function, 83cunifvariate() function, 83expovariate() function, 83gamma() function, 83gauss() function, 83lognormvariate() function, 83normalvariate() function, 83paretovariate() function, 83Random class, 82random() function, 83


randrange() function, 83seed() function, 84shuffle() function, 84uniform() function, 84vonmisesvariate() function, 84weibullvariate() function, 84

Ranges, 83Rapid searching, 201Ratio, 21raw_input() function, 446raw-unicode-escape encoding, 187re module, 147, 203, 215–216, 218, 231, 236–255

re.engine constant, 245re.error exception, 255re.escape() function, 245re.findall() function, 245re.I constant, 244re.IGNORECASE constant, 244re.L constant, 244re.LOCALE constant, 244re.M constant, 244re.compile() class factory, 248

.findall() method, 249

.flags attribute, 249

.groupindex attribute, 249

.match() method, 250

.pattern attribute, 250

.search() method, 250–251

.split() method, 251

.sub() method, 251

.subn() method, 252re.match() class factory, 248–249

.end() method, 252

.endpos attribute, 252

.expand() method, 252–253

.groupdict() method, 253

.grouping, 253

.groups() method, 253–254

.lastgroup attribute, 254

.lastindex attribute, 254

.pos attribute, 254

.re attribute, 254

.span() method, 254

.start() method, 255

.string attribute, 255

re.MULTILINE constant, 244
re.purge() function, 246
re.S constant, 244
re.search() class factory, 249

.end() method, 252

.endpos attribute, 252

.expand() method, 252–253

.groupdict() method, 253

.grouping, 253

.groups() method, 253–254

.lastgroup attribute, 254

.lastindex attribute, 254

.pos attribute, 254

.re attribute, 254

.span() method, 254

.start() method, 255

.string attribute, 255

re.split() function, 223, 246
re.sub() function, 213, 246–248
re.subn() function, 248
re.U constant, 245
re.UNICODE constant, 245
re.VERBOSE constant, 245
re.X constant, 245

.read() method, 17, 37, 147, 285, 389
read.backwards.py utility, 127–128
Reading

AIFC audio files, 104
AIFF audio files, 104
directory listings, 57–58
file backwards by record, line, or paragraph, 126–128
file in line-oriented style, 37–38
gzip object, 174
IFF audio data, 104
line from file, 38–39
lines with continuation characters, 226–227
multiple files, 61–63
URLs (Uniform Resource Locators), 391
ZIP files, 176–181

.readline() method, 17, 37, 63, 285, 381, 389
readline module, 108
.readlines() method, 17, 37, 285, 381, 389
rebinding names, 447
reconvert module, 235
reconvert.convert() function, 235
Records, reading files backwards by, 126–128
Recursive containers, 95
Recursive objects, 97
Redirected input, 61–63
Redirected streams, 51
re.DOTALL constant, 244
reduce() function, 435–438, 447, 450
Referential transparency, 484
reformat_para.py file, 116–117
Reformatting paragraphs, 115–117
regex module, 231, 235
Regular expressions, 194–195, 203–204

advanced extensions, 215–220
alphanumeric character class (\w), 239
alternation operator (|), 208, 240
any character (.) wildcard, 207
atomic operators, 236–240

atoms, 207
attributes, 249–255
backreferences, 210–211, 214, 218, 238
backslash character (\), 206
basic probability, 212
beginning of line (^), 206, 239
beginning of string (\A), 239
bounded numeric quantifier, 242
character class, 208
character classes, 208, 238
character codepages, 217
checking for server errors, 224–226
class factories, 248–249
clearing cache, 246
comments, 220
comments (?#...), 242
concept of state, 259
constants, 244–245
continuing over multiple lines, 220
curly-brace quantification ({}), 210
definition of, 204–205
deprecated modules, 235
detecting duplicate words, 223–224
digit character class (\d), 208, 238
end of line ($), 206, 239
end of string (\Z), 239
Escape (\) atomic operator, 236
escape-style shortcuts, 208
exact numeric quantifier, 241
existential quantifier (+), 240
extensions, 209–210, 215–220
functions, 245–248
grouping, 207
grouping operators, 237–238
group-like patterns, 242–244
how many times atom occurs, 209
identifying floating point, 262
identifying URLs and email addresses in text, 228–229
invalid, 255
limitations, 203
listing substrings, 246
literal characters, 205, 207
locating matched pattern, 213
lookahead assertions, 219
lookbehind assertions, 219–220
looking for begin-line and end-line characters, 216
lower-bound quantifier, 241
matching patterns in text, 205–214
matching too much, 211–212
matching zero-length pattern for line beginnings (^), 208
methods, 249–255
modifiers, 215–217
modifying target text, 214
modules, 234–255
named group backreference (?P=name), 244
named group identifier (?P<name>), 244
negative lookahead assertion (?!...), 243
negative lookbehind assertion (?<!...), 243
newline character, 207
non-alphanumeric character class (\W), 239
non-backreferenced atom (?:...), 242–243
non-digit character class (\D), 238
non-greedy bounded quantifier, 242
non-greedy existential quantifier (+?), 241
non-greedy potentiality quantifier (??), 241
non-greedy quantifiers, 215
non-greedy universal quantifier (*?), 240
non-whitespace character class, 239
non-word boundary, 240
one or more times (+), 209
operations, 236–255
parsing data, 223
pattern modifiers (?Limsux), 242
pattern summary, 236, 237
patterns, 207–208
patterns matching tokens, 335
positive lookahead assertion (?=...), 243
positive lookbehind assertion (?<=...), 243
potentiality quantifier (?), 241
pretty-printing numbers, 229–231
quantifiers, 209, 211–212, 240–242
quoting as raw strings, 206
reading lines with continuation characters, 226–227
replacement patterns to accompany matches, 213
reverse character class (^), 208
spaces, 205
standard modules, 231–255
substituting literal text for literal text, 213
summarizing command-line option documentation, 221–223
symbols with special meaning, 206
text block flush left, 220–221
Unicode alphabetic characters, 218
universal quantifier (*), 240
versions and optimizations, 231–232
whitespace, 214
whitespace character (\s) shortcut, 208, 239
wildcard character, 239
word boundary, 239
zero or one times (?), 209
zero-width match, 206
zero-width positional patterns, 207

Regular files, 66, 70
Relative file positioning, 156
RELAX NG, 401
Remote procedure calls, 407
REMOTE_ADDR environment variable, 378
re_new() function, 213
repl function, 247
"".replace() string method, 139–140
Replacement backreferences, 214
report2data() function, 289
reporthook() function, 390
Reports

concrete parse tree of components, 288
other ways of processing, 281–282
processing with concrete state machine, 274–280

repr() function, 15, 229
.__repr__() method, 54, 431
repr module, 96–98

repr.maxlevel attribute, 97
repr.maxlist attribute, 97
repr.maxlong attribute, 98
repr.maxother attribute, 98
repr.maxstring attribute, 98
repr.maxtuple attribute, 97
repr.Repr() class, 97
repr.repr() function, 98
repr.repr_TYPE() function, 98

Representations, 97–98
re_show() function, 205, 213, 217
Resource fork, 161
resource module, 103
Resources, 390, 392
return statement, 439
rexec module, 108
RFC-822, 344
RFC-2822 date string, 365
RFC-2231 encoded string, 364
RFC-822 message manipulation class, 397
RFC-2822 messages, 345, 350
rfc822 module, 345, 397
RFC-2822-formatted date, 364
rfc822.Mailbox class, 372
"".rfind() string method, 140–141
RGB color model, 104
rgbimg module, 105
"".rindex() string method, 141
riscos module, 74
"".rjust() string method, 141–142
rlcompleter module, 108
RLE (run-length encoding), 455–456
robotparser module, 285
robots.txt access control file, 285
rot13 encoding, 187

rotor objects, creation of, 169–170
rotor.newrotor class, 169–170

.decrypt() method, 170

.decryptmore() method, 170

.encrypt() method, 170

.encryptmore() method, 170

.setkey() method, 170

"".rstrip() string method, 142
Runtime environment, 49–53
RuntimeError exception, 281

S

salutation() function, 450–451
Sample buyer/order report, 275
SAX events, 406, 413
SAX extension, 413
SAX handlers, 406
SAX parsers, 407
SAX (Simple API for XML), 404–406
sched module, 108
Schneier, Bruce, 169
Schwartz, Randal, 113
Schwartzian Transforms, 113–115
Scripting languages templating system, 34–35
Scripts

locating resources, 102
supporting old, 58

ScrolledText module, 108
Search paths, 82
Searching

dictionaries, 25–26
rapid, 201

Secret Labs Regular Expression Engine, 236
sed, 204
.seek() method, 17, 147, 285
select module, 397
self argument, 13
self object, 15
send_email.py file, 370–371
Sequence operator (,), 325
Sequences, 450

combining, 449
difference and similarity of pairs, 283–284
indexes, 449

Serial to line connections, 102
Serialized objects, 94
Serializing objects, 90–100
Servers, checking for errors, 224–226
services() function, 223
Set datatype, 429–430
.__setitem__() method, 431
sets.Set module, 429–430
.__setstate__() method, 93

SGI, 104–105
SGI systems (IRIX), 101
SGML (Standard Generalized Markup Language), 285
sgmllib module, 285
SHA message digests, 170–172
sha module, 167, 170–172

new class, 170–171
sha class, 171

.copy() method, 171

.digest() method, 171

.hexdigest() method, 172

.update() method, 172

sha object, 170–172
SHA (Secure Hash Algorithm), 167, 170–172
Shallow copy, 43–44
sha.py file, 197
Shared objects, 101
Shared-key encryption, 484
shelve databases, 98–99
shelve module, 98–100, 147
shlex module, 286
shortline() function, 4
show_opts() function, 223
show_services.py file, 222
shutil module, 68–69, 74

copy() function, 68
copy2() function, 68
copyfile() function, 68
copyfileobj() function, 68
copymode() function, 68
copystat() function, 69
copytree() function, 69
rmtree() function, 69

Side effects, 447
signal module, 108
Signature-based functions, 8
Silicon Graphics' Graphics Library, 104
Simple generators, 439–441
simple.cgi script, 377
SimpleHTTPServer module, 105
SimpleParse EBNF-style grammar

declaration patterns, 321–323
quantified potentiality, 327–328

SimpleParse module, xvi, 286

backtracking, 328
grammar defining structure of processed text, 316
literals, 323
production, 322
quantifiers, 323–324
structures, 324–326
taglist creation, 319
traversal and use of generated mx.TextTools taglist, 317
useful productions, 326–327

simpleparse.common.calendar_names production, 326
simpleparse.common.chartypes production, 326
simpleparse.common.comments production, 326
simpleparse.common.iso_date production, 327
simpleparse.common.iso_date_loose production, 327
simpleparse.common.numbers production, 327
simpleparse.common.phonetics production, 327
simpleparse.common.string production, 327
simpleparse.common.timezone_names production, 327

simpleTypography.py file, 317–320
SimpleXMLRPCServer module, 407
Siong, Ng Pheng, 165
SIT, 173
site module, 108
.skip() method, PLY, 336
slice() function, 57
Slices, 43, 312–314
Smart ASCII, marking up, 292–296, 317–318, 329
Smart ASCII format, 272
SMTP clients, 370–371
SMTP server, 371
SMTP (Simple Mail Transport Protocol), 121
smtplib module, 343, 344, 370–371

smtplib.SMTP class, 371

.login() method, 371
.quit() method, 371
.sendmail() method, 371

Snap shots, 42–43
sndhdr module, 347, 397
socket module, 343, 397
Sockets, 70
SocketServer module, 397
Solutions, expressed in terms of what, 447
.sort() method, 112–113
Sorting

custom algorithm, 113–115
custom comparison function, 113
lines quickly on custom criteria, 112–115
lists, 32, 112–113
maintaining order, 105
unnatural, 113

Sound file formats, 397
Source code

analyzing, 106
compiling possible incomplete, 106
data as, 445–446
mixed use of tabs and spaces, 286
printing reports on, 107
profiling performance characteristics, 107

SourceForge Web site, xv, xvi
Spaces and tabs, 299
Spam, 345
"Spam Filtering Techniques," 345
SpamBayes, 345
Spark module, 328
Special data values and formats, 82–89
Special list functions, 448–449
Specialized parsers, 282–286
Specializing classes, 11
Speech Manager interface, 102
"".split() string method, 121, 142–144
"".splitlines() string method, 144
Splitting strings, 142–144, 216
sprintf() function, 35
sre module, 231, 236
Stack frames, 49
Stack traces, 109
Standard color selection dialog interface, 101
Standard Internet-related tools, 395–398
Standard library

specialized parsers, 282–286
text processing tools, 287
XML modules, 403–407

Standard modules, 41–89
Standard operations as functions, 47–48
"".startswith() string method, 144
.startswith() method, 3
startText() function, 273
Startup module, 108
stat module, 69–71

S_ISBLK() function, 70
S_ISCHR() function, 70
S_ISDIR() function, 69–70
S_ISFIFO() function, 70
S_ISLNK() function, 70
S_ISREG() function, 70
S_ISSOCK() function, 70
ST_ATIME constant, 70
ST_CTIME constant, 71
ST_DEV constant, 70
ST_GID constant, 70
ST_INO constant, 70
ST_MODE constant, 70
ST_MTIME constant, 71
ST_NLINK constant, 70
ST_SIZE constant, 70
ST_UID constant, 70

statcache module, 58, 108
State function body, 279
State machines, 257

abstracting form, 273–274
block-level, 292
defining, 267–268
describing, 340
input loop file, 272–273
low-level parsing, 286–316
parsers, 267
state reuse, 280–281
subgraphs, 280–281
tag table, 288
text processing, 268–269
when not to use, 269–272
when to use, 272–273

state variable, 273
Stateful text, 258–260
Stateful text file, 269
StateMachine class, 274
statemachine.py file, 273–274
States

buyers tag table, 289–290
concept of, 259
diagram, 281
handlers, 274, 279
reuse, 280–281
special behavior, 288
tag table, 288
tag tables, 289
transitions, 279

stat_result object, 79
statvfs module, 108
STDERR pipes, 78
STDERR (standard error stream), 425–427

file object, 50–51
functions to spawn commands with pipes, 107

STDIN pipes, 77–78
STDIN (standard input stream), 61, 147

file object, 51
functions to spawn commands with pipes, 107

STDOUT pipes, 77–78
STDOUT (standard output stream), 61–62, 147, 425–427

buffered, nonvisible output, 103
file object, 51
functions to spawn commands with pipes, 107

StopIteration exception, 440, 441
Storage object, 381
Storing objects, 90–100
str() function, 10, 15, 229, 355, 389, 450
.__str__() method, 54, 352, 431
str type, 33–34, 422–423

str.__add__() method, 33
str.__contains__() method, 33–34
str.__getitem__() method, 33
str.__getslice__() method, 33
str.__hash__() method, 33
str.__len__() method, 33
str.__mul__() method, 33

str.__rmul__() method, 33

stray_punct tag table, 295–296
Strict encoding, 187
String buffer, 153–158
String buffer objects, 153–154
string datatype, 422–423
String delimiting search paths, 82
string functions, 111
String interpolation datatypes, 423–425
String methods, 111
string module, 33, 111, 128–147

atof() function, 132
atoi() function, 132
atol() function, 132
capitalize() function, 132–133
capwords() function, 133
center() function, 133
count() function, 134
digits constant, 130
expandtabs() function, 134–135
find() function, 135, 200, 221
hexdigits constant, 130
index() function, 135–136
join() function, 129, 130, 137, 143
joinfields() function, 138
letters constant, 131
ljust() function, 138
lower() function, 138
lowercase constant, 131
lstrip() function, 139
maketrans() function, 139, 145
octdigits constant, 130
printable constant, 132
punctuation constant, 131
replace() function, 129, 139–140, 213, 221
rfind() function, 140–141
rindex() function, 141
rjust() function, 141–142
rstrip() function, 142
split() function, 37, 129, 137, 142–144, 223
splitfields() function, 144
strip() function, 144
swapcase() function, 145
translate() function, 139, 145–146
upper() function, 146
uppercase constant, 131
whitespace constant, 131–132
zfill() function, 146–147

string object, 129
Stringifying calendars, 100–101
StringIO module, 153–158

StringIO.StringIO class, 153–155

.close() method, 155

.flush() method, 155

.getvalue() method, 155

.isatty() method, 155

.read() method, 156

.readline() method, 156

.readlines() method, 156

.seek() method, 156

.tell() method, 157

.truncate() method, 157

.write() method, 157

.writelines() method, 157–158

Strings

all non-alphanumeric characters escaped, 245
applying tag table, 310–311
backslashes and double quotes escaped, 365
base64 encoded, 158
based on hex-encoded string hexstr, 312
basic transformations, 128–147
beginning of, 144
Boolean values indicating property, 136
8-byte, 186–188
capitalized words, 133
composed of slices from other strings, 313
concatenating elements of list, 137
concatenating to sha object, 172
converting, 132
converting letter case, 145
converting objects to, 90
cryptographic hash, 167–169
customizing representation, 96–98
default length, 98
delimiting lines in file, 82
dictionary-based interpolation, 35–36
double quotes or angle brackets removed, 365
ending, 134
extracting content from fillers, 194–195
file-based interface, 161–162
as files, 147–158
finding first occurrence of any character, 314
fuzzy matching against patterns, 283
hexadecimal representation, 315
identifying operating system, 82
immutability, 129
index position of substring, 135–136
initial character converted to uppercase, 132–133
interpolated values, 423–425
interpolation of special characters, 35
interpreter version information, 51–52
leading and trailing whitespace characters removed, 144
leading whitespace characters removed, 139
length of, 85
listing

keys, 380

lines, 144
nonoverlapping substrings, 142–144
in string buffer, 156

lowercase letters converted to uppercase, 146, 316

magic methods, 33
manipulating image data stored as, 104
message text contained in, 348
modifying, 145–146
multiple operations on, 4
mutable, 130
nonmagic methods, 33
nonoverlapping occurrences of pattern, 245
with normalized whitespace, 311
occurrences of old replaced by new, 139–140
one-byte from current position, 151
packed values, 85
padded with

leading spaces, 141–142
leading zeros, 146–147
symmetrical leading and trailing spaces, 133
trailing spaces, 138

partial interpolation, 36–37
path delimiters, 82
path symbolic link, 78
patterns matching, 233
persistence, 147
prefix in tuple, 313
presence of single character in, 33
as Python keyword, 107
reading and writing, 15–17
referring to directory, 81–82
replacing, 314
returning

from gzip object, 175
from memory-mapped file object, 151
from string buffer, 156

RFC-2231-encoded string, 364
serialized form of object, 94
with special characters escaped, 390
splitting, 142–144, 216, 314–315
starting at current file position, 150–151
substrings, 134, 140
tabs replaced by spaces, 134–135
tagging, 290
trailing whitespace characters removed, 142
translation table, 139, 145–146
uppercase letters converted to lowercase, 138, 313
zlib compressed version, 182

"".strip() function, 144
Strongly emphasized text, 317
strongs tag table, 296
struct module, 84–86

calcsize() function, 85
pack() function, 85
unpack() function, 86

Structured data, 90
Structured text database, 484
Structures

recursively containing themselves, 263
SimpleParse module, 324–326

Subdirectories, 60
Subelements, 400
Subgraphs, 280–281
Subshell, executing cmd command, 80
Substrings, 134

copying within memory-mapped file object, 150

index position, 135–136, 140–141
listing nonoverlapping, 142–144
strings splitting into, 315

4Suite, 408, 414–415
Summarizing command-line option documentation, 221–223
Sun AU audio files, 105
Sun audio hardware interface, 105
sunau module, 105
SUNAUDIODEV module, 105
sunaudiodev module, 105
"".swapcase() function, 145
switch statement, 320
symbol module, 285
Symbolic constants, 69
Symbolic links, 66–67, 69–70
Symmetrical encryption, 163, 484
sys function, 55
sys module, 49–53, 74

sys.argv attribute, 49
sys.byteorder attribute, 49
sys.copyright attribute, 49
sys.displayhook() function, 49
sys.excepthook() function, 49
sys.exc_traceback attribute, 49
sys.exc_type attribute, 49
sys.exc_value attribute, 49
sys.exec_prefix attribute, 49
sys.executable attribute, 49
sys.exit() function, 52
sys.getdefaultencoding() function, 52
sys.getrefcount() function, 53
sys.hexversion attribute, 50
sys.last_traceback attribute, 49
sys.last_type attribute, 49
sys.last_value attribute, 49
sys.maxint attribute, 50
sys.maxunicode attribute, 50

sys.path attribute, 50
sys.platform attribute, 50
sys.stderr attribute, 50–51
sys.__stderr__ attribute, 50–51
sys.stdin attribute, 51
sys.__stdin__ attribute, 51
sys.stdout attribute, 51
sys.__stdout__ attribute, 51
sys.stdout.write() method, 425
sys.tracebacklimit attribute, 49
sys.version attribute, 51–52
sys.version_info attribute, 52

syslog module, 103
System configuration, 74
SystemExit exception, 52

T

tabnanny module, 286
tag() function, 296
Tag tables, 288

applying to string, 310–311
based on EBNF grammars, 316
changing read-head position, 301
correctly configuring, 299–300
debugging, 297–298
defining, 289
documentary purposes, 300–301
jump conditions, 299–300
modifying, 320
nonlooping, 297
states, 288–289
success state, 288
tuple, 299
type of pattern to match, 299

Tagging strings, 290
Taglist, 292

comparing tuples on slice positions, 310
generating, 318–320
markup production, 319
output, 321
unreported production, 322
usage, 318–320

TagStack class, 386
tag_words tag table, 296
Tasks

column statistics for delimited or flat-record files, 117–120

counting characters, words, lines and paragraphs, 120–121

quickly sorting lines on custom criteria, 112–115

reading file backwards by record, line, or paragraph, 126–128
reformatting paragraphs of text, 115–117
text processing applications, 41
transmitting binary data as ASCII, 121–123
word or letter histograms, 123–126

TCL/Tk Python interface, 108
.tell() method, 17, 285
Telnet clients, 397
telnetlib module, 343, 397
tempfile module, 71

mktemp() function, 71
TemporaryFile() function, 71

Temporary

file object, 71
filenames, 71
files, 71, 80

TERMIOS module, 103
termios module, 103
t_error() function, 336
Testing

capabilities, 10–11
files, 69–71

Text, ix–x

all letters, 131
custom processing, x
describing complex patterns, 204
fast manipulation tools, 286–316
identifying URLs and email addresses, 228–229
locating patterns, x
lowercase letters, 131
matching patterns, 205–214
processing, 2, 111
reformatting paragraphs, 115–117
stateful, 258–260
uppercase letters, 131

Text blocks, 220–221
Text editors, x, 115
Text files

composed of multiple delimited parts as several files, 285
processing, 268–269
stateful, 269
stateful chunk, 268
when not to use state machine, 269–272
when to use state machine, 272–273

Text processing

definition of, ix–x
filters, 3–4
frequency, xi
HOFs (higher-order functions), 1–7
Internet protocols, 343
large chunks of text, 2
log files, 2–3

philosophy, x–xi
state machines, 268–269
stateful, 267
tasks, xi, 41

Textual sources and log files, 2–3
textwrap module, 115
Third-party

Internet-related tools, 398–399
modules, 90
XML-related tools, 408–416

thread module, 108
Threaded applications, 107
Threaded programming, 108
threading module, 108
Threat model, 196–198
tidy, 384
t_ignore variable, 331, 336
Time, 86–89
time module, 86–89

accept2dyear attribute, 86
altzone constant, 86–87
asctime() function, 87
clock() function, 87
ctime() function, 87
daylight constant, 86–87
gmtime() function, 88
localtime() function, 88
mktime() function, 88
sleep() function, 88
strftime() function, 88
strptime() function, 89
time() function, 89
timezone constant, 86–87
tzname constant, 86–87

Time tuple, 86–88
Timestamps

copying data, 69
email, 365

Timezone, 86
"".title() function, 133
Tix module, 108
Tkinter module, 108
t_MDASH(), 332
t_newline() function, 331
Token list, 329–335
.token() method, 336
token module, 285
Token patterns, 335–336
tokenize module, 285
Tokenizers, 261, 285
Tokens, 261

attributes, 335
getting for grammar rules, 339
identifying, 329
listing types, 330–331
regular expressions matching, 335
types, 329

tokens variable, 330, 335
traceback module, 109
traceback objects, 57
"".translate() function, 145–146
Transmitting binary data as ASCII, 121–123
True division operator (/), 22
t_RULENAME form, 329
truncate method, 17
try statements, 52, 443
try/except/else statement, 443–444
try/finally statement, 443–444
tty module, 103
tuple() function, 10
tuple type, 28–32

tuple.__add__() method, 29–30
tuple.__contains__() method, 30
tuple.__getitem__() method, 30
tuple.__getslice__() method, 30
tuple.__hash__() method, 30–31
tuple.__len__() method, 31
tuple.__mul__() method, 31
tuple.__rmul__() method, 31

Tuple-like objects, 28
tuples, 316, 427
Turing, Alan, 169
turtle module, 108
Twisted Matrix Laboratories Web site, 398
Twisted Matrix library, 45
Twisted module, 398
twisted.python.usage module, 45
Txt2Html utility, 272–273, 287, 292, 317
.type attribute, PLY, 335
type() function, 53–55, 421
TYPE type, 98
Typed arrays of numeric values, 105
TypeError exception, 12, 30, 112, 157
Types, 53–57
types module, 53–57

BufferType constant, 55
BuiltinFunctionType constant, 55
BuiltinMethodType constant, 55
ClassType constant, 55
CodeType constant, 55
ComplexType constant, 55
DictionaryType constant, 55
DictType constant, 55
EllipsisType constant, 55
FileType constant, 55–56
FloatType constant, 56
FrameType constant, 56
FunctionType constant, 56

GeneratorType constant, 56
InstanceType constant, 56
IntType constant, 56
LambdaType constant, 56
ListType constant, 56
LongType constant, 56
MethodType constant, 56
ModuleType constant, 57
NoneType constant, 57
SliceType constant, 57
StringType constant, 57
StringTypes constant, 57
TracebackType constant, 57
TupleType constant, 57
TypeType constant, 57
UnboundMethodType constant, 56
UnicodeType constant, 57
XRangeType constant, 57

typography() function, 292–295
typo_html.py file, 320

U

u"".encode() method, 188
urlparse module, 392–394
urlparse.urlparse() function, 393–394
unary minus (-) operator, 22
Unconditional commands, 300–301
unichr() method, 188
Unicode, 465

built-in functions and methods, 186–188
CJK (Chinese-Japanese-Korean) alphabets, 185
codepoint information, 191
declarations, 468–469
default string encoding, 52
definition of, 466
encodings, 185, 467
finding codepoints, 469–470
native support, 185
resources, 470
UTF-8, 185
UTF-16, 185
UTF-32, 185

Unicode characters, 50, 191–193
Unicode Consortium, 466
unicode datatype, 423
Unicode file, 189–190
unicode() function, 10–11, 186–188, 353
Unicode object, 186–188
Unicode string object, 188
Unicode strings, 112, 186–188, 423
unicodedata module, 191–193

bidirectional() function, 192
category() function, 192
combining() function, 192
decimal() function, 192
decomposition() function, 192–193
digit() function, 193
lookup() function, 193
mirrored() function, 193
name() function, 193
numeric() function, 193

UnicodeError exception, 187, 189
unicode-escape encoding, 187
Unit testing framework, 109
unittest module, 109
Universal quantifier (*), 240, 323
Unix, 102–103
Unix (n)dbm library interface, 92
Unix password database, 103
Unix shell-like syntaxes, 286
Unix syslog library interface, 103
Unix-like directories, 77
Unix-like systems

detailed information about current operating system, 81

hard link from path, 76
killing external process, 76
mailcap file, 396
netrc file, 396
path symbolic link, 78
processing permissions, 74
root directory, 75
soft link between paths, 80
wc utility, 120

Unix-style passwords, 166
untabify.py utility, 221
Updating

dictionaries, 27
os.environ variable, 78

"".upper() function, 146
Uppercase letters, 131
urldump.py file, 284
urlencoded query for POST or GET request, 390–391
url_examine.py file, 224
urllib module, 388–392, 398

FancyURLopener class, 391
quote() function, 390
quote_plus() function, 390
unquote() function, 390
unquote_plus() function, 390
FancyURLopener.version attribute, 392
urlencode() function, 389–391
FancyURLopener.get_user_passwd() method, 391

FancyURLopener.open() method, 392
FancyURLopener.open_unknown() method, 392
FancyURLopener.prompt_user_passwd() method, 392
FancyURLopener.retrieve() method, 392
urlopen() function, 389
URLopener.open() method, 392
URLopener.open_unknown() method, 392
URLopener.retrieve() method, 392
URLopener.version attribute, 392
urlretrieve() function, 390

urllib.URLopener class, 391
urllib objects interface, 388

.close() method, 389

urllib2 module, 398
urlparse module, 282

urljoin() function, 394
urlunparse() function, 394

URLs (Uniform Resource Locators), 389

components, 392–394
constructing, 394
copying, 392
identifying, 228–229
opening, 392
parsing, 392–394
reading, 391

US-ASCII encoding, 186
user module, 108
User-defined classes, 55–57
UserDict module, 11, 36

UserDict.UserDict class, 24–27

.clear() method, 26
.__cmp__() method, 24–25
.__contains__() method, 25
.copy() method, 26
.__delitem__() method, 25
.get() method, 26
.__getitem__() method, 25
.has_key() method, 26
.items() method, 26–27
.iteritems() method, 26–27
.iterkeys() method, 27
.itervalues() method, 27
.keys() method, 27
.__len__() method, 26
.popitem() method, 27
.setdefault() method, 27
.__setitem__() method, 26
.update() method, 27
.values() method, 27

UserInt module, 12
USERLEVEL state, 269
UserList module, 11

UserList class, 28–32

.__add__() method, 29–30
.append() method, 32
.__contains__() method, 30
.count() method, 32
.__delitem__() method, 30
.__delslice__() method, 30
.extend() method, 32
.__getitem__() method, 30
.__getslice__() method, 30
.__hash__() method, 30–31
.__iadd__() method, 29–30
.__imul__() method, 31
.index() method, 32
.__len__() method, 31
.__mul__() method, 31
.pop() method, 32
.remove() method, 32
.__rmul__() method, 31
.__setitem__() method, 31
.__setslice__() method, 31
.sort() method, 32

UserString module, 11

UserString class, 33–34

.__contains__() method, 33–34

.__iadd__() method, 33

.__imul__() method, 33

.__radd__() method, 33

UTF-8 encoded files, 468
UTF-16 encoded files, 468
utf-7 encoding, 187
utf-8 encoding, 187
utf-16 encoding, 187
utf-16-be encoding, 187
utf-16-le encoding, 187
Utilities, command-line switches, 44–47
Utility functions, 271–272, 311–316, 406
uu module, 122, 163
uu.decode() function, 163
uu.encode() function, 163
UUencoding, 121–122, 159

V

Valid XML documents, 401
Validating parser, 403
Validating XML parser, 413
.value attribute, PLY, 335, 336
ValueError exception, 135, 141, 380
.values() dictionary method, 355, 380
van Rossum, Guido, 1, 417
Vaults of Parnassus Encryption/Encoding index Web site, 164
Vaults of Parnassus Web site, xv, 112, 203, 398

videoreader module, 105
Virtual machine, 418
Viruses, 345
visitor.cgi script, 379
Visual C++ Runtime libraries, 102

W

W module, 103
Wall, Larry, xi
Warning messages, modifying behavior, 109
Warnings interpreter, 50–51
warnings module, 109
waste module, 104
WAV audio files, 105
wave module, 105
W3C Document Object Model, Level 2, 404
wc utility, 120
W3C XML Schema, 401–402
wc.py file, 121
weakref module, 109
Web application server, 399
Web bugs, 378
Web clients, 396
Web pages, dynamic, 34
Web servers

checking for errors, 224–226
robots.txt access control file, 285

webbrowser module, 398
Well-formed XML documents, 401
wget, 392
whichdb module, 93
whichdb.whichdb() function, 93
while/else/continue/break statements, 438–439
Whitespace, 263

regular expressions, 214
as single divider, 143

Whitespace character (\s) shortcut, 208, 239
Whitespace characters, 312
Whitespace compression, 455
Whitespace-separated words, 315
whrandom module, 109
Wichmann-Hill random number generator, 82, 84, 109
Widgets for Mac, 103
Wildcard character (.), 239
Windows

access to registry, 100alternative path delimiter, 81launching application, 79

Windows-specific functions, 102Windows-style configuration files, 282–283_winreg module, 100winsound module, 104

WinZip, 176
Word boundary (\b), 239
Word or letter histograms, 123–126
Word similarity, 283
Word-based Huffman compressed text, 460–461
word_huffman module, 460–464
wordplusscanner.py file, 331–332
Words, counting, 120–121
wordscanner.py file, 330
word_set characters, 290
Working directory, 75
World Wide Web applications
    accessing Internet resources, 388–394
    CGI (Common Gateway Interface), 376–383
    HTML documents, 383–388
WorldScript-Aware Styled Text Engine interface, 104
write() method, 17
writelines() method, 17
write_payload_list.py file, 361
Writer objects, 284
Writing
    AIFC audio files, 104
    AIFF audio files, 104
    gzipped files, 173–175
    networked applications, 398
    ZIP files, 176–181

X
XDR (eXternal Data Representation), 104
xdrlib module, 104
XHTML parsers, 384–388
XHTML-style empty tag, 388
XML documents
    canonicalized, 413
    CDATA, 401
    comments, 401
    document type declarations, 401
    DOM (Document Object Model), 384, 403–404
    DTD (Document Type Definition), 401–402
    event-based API, 404
    indices of, 409
    knowledge management, 414–415
    nodes, 401
    processing instructions, 401
    PYX format, 414
    SAX (Simple API for XML), 404
    schemas, 402–403
    transforming into Python objects, 409–410
    tree of nodes, 403–404
    valid, 401
    validating parser, 403


    well-formed, 401
    XML Schema, 401
    XSLT stylesheets, 403

XML (Extensible Markup Language), 344, 399
    attributes in XML tags mapping names to values, 399–400
    data model, 399–401
    dialect and DTDs, 261
    dialects, 399
    elements, 400
    nesting tags inside tags, 400–401
    nodes, 400–401
    parsing, 407
    subelements, 400
    third-party tools, 408–416

xml package, 282
XML Schema, 401
xmlcat.py file, 406
xml.dom module, 384, 404
xml.dom.minidom module, 405
xml.dom.pulldom module, 405, 413
xmllib module, 407
xml.parsers.expat module, 405
xml_pickle module, 147
xmlproc, 413
XML-RPC format, 407
xmlrpclib module, 407
xml.sax module, 384
xml.sax package, 405–406
xml.sax.handler module, 406
xml.sax.saxutils module, 406
xml.sax.writers, 413
xml.sax.xmlreader module, 407
4xpath, 409
XPath support, 413
xrange() function, 434
.xreadlines() method, 17, 37, 434
xreadlines module, 72
xreadlines.xreadlines() function, 72, 120, 126
XSLT
    stylesheets, 403
    support, 413
    transformations, 415

Y
yacc empty productions, 338
yacc module, 337–339
yacc.errok(), 340
yacc.py, 329
yacc.restart(), 340
YaccSlice object, 332, 338
yacc.token(), 340
YAML format, 415–416
YAML home page, 409
yaml module, 415–416
YAML tools, 94
yield statement, 439–441
YIQ color space, 104

Z
Zawinski, Jamie, 204
Z_BEST_COMPRESSION constant, 182
Z_BEST_SPEED constant, 182
Zero-case lexer, 261
ZIP archives, 177–178
ZIP files, 176–181
ZIP format, 172–173, 177
zip() function, 447, 449
zipfile module, 173, 176–181
    BadZipFile exception, 181
    error exception, 181
    is_zipfile() function, 177
    PyZipFile class, 177
    stringCentralDir constant, 177
    stringEndArchive constant, 177
    stringFileHeader constant, 177
    structCentralDir constant, 177
    structEndArchive constant, 177
    structFileHeader constant, 177
    ZIP_DEFLATED symbolic name, 177
    ZipFile class, 177–179
        .close() method, 178
        .compression attribute, 179
        .debug attribute, 179
        .filelist attribute, 179
        .filename attribute, 179
        .fp attribute, 179
        .getinfo() method, 178
        .infolist() method, 178
        .mode attribute, 179
        .namelist() method, 178
        .NameToInfo attribute, 179
        .printdir() method, 178
        .read attribute, 178
        .start_dir attribute, 179
        .testzip() method, 178
        .write() method, 179
        .writestr() method, 179
    ZipInfo class, 178–179
        .comment attribute, 180
        .compress_size attribute, 180
        .compress_type attribute, 180
        .CRC attribute, 179
        .create_system attribute, 180
        .create_version attribute, 180
        .date_time attribute, 180


        .external_attr attribute, 180
        .extract_version attribute, 180
        .filename attribute, 180
        .file_offset attribute, 180
        .file_size attribute, 180
        .header_offset attribute, 180
        .volume attribute, 180

    ZIP_STORED symbolic name, 177
zlib library, 181–185
zlib module, 173, 181–185
    adler32() function, 182
    compress() function, 182
    compressobj object, 183
    compressobj.flush() method, 184
    crc32() function, 182
    decompress() function, 182
    decompressobj object, 183
    decompressobj.decompress() method, 185
    decompressobj.flush() method, 185
    decompressobj.unused_data attribute, 184
    error exception, 185
    Z_BEST_COMPRESSION constant, 181
    Z_BEST_SPEED constant, 181
    ZLIB_VERSION attribute, 181
    compressobj.compress() method, 183–184
    Z_HUFFMAN_ONLY constant, 181

ZODB (Zope Object Database) library, 100, 147
Zope home page, 399
Zope module, 399
