← Index
NYTProf Performance Profile   « line view »
For /usr/local/libexec/sympa/task_manager-debug.pl
  Run on Tue Jun 1 22:32:51 2021
Reported on Tue Jun 1 22:35:12 2021

Filename/usr/local/lib/perl5/site_perl/MIME/Words.pm
StatementsExecuted 0 statements in 0s
Subroutines
Calls P F Exclusive
Time
Inclusive
Time
Subroutine
0000s0sMIME::Words::::BEGIN@69MIME::Words::BEGIN@69
0000s0sMIME::Words::::BEGIN@70MIME::Words::BEGIN@70
0000s0sMIME::Words::::BEGIN@71MIME::Words::BEGIN@71
0000s0sMIME::Words::::BEGIN@74MIME::Words::BEGIN@74
0000s0sMIME::Words::::BEGIN@85MIME::Words::BEGIN@85
0000s0sMIME::Words::::BEGIN@86MIME::Words::BEGIN@86
0000s0sMIME::Words::::_decode_BMIME::Words::_decode_B
0000s0sMIME::Words::::_decode_QMIME::Words::_decode_Q
0000s0sMIME::Words::::_encode_BMIME::Words::_encode_B
0000s0sMIME::Words::::_encode_QMIME::Words::_encode_Q
0000s0sMIME::Words::::decode_mimewordsMIME::Words::decode_mimewords
0000s0sMIME::Words::::encode_mimewordMIME::Words::encode_mimeword
0000s0sMIME::Words::::encode_mimewordsMIME::Words::encode_mimewords
Call graph for these subroutines as a Graphviz dot language file.
Line State
ments
Time
on line
Calls Time
in subs
Code
1package MIME::Words;
2
3=head1 NAME
4
5MIME::Words - deal with RFC 2047 encoded words
6
7
8=head1 SYNOPSIS
9
10Before reading further, you should see L<MIME::Tools> to make sure that
11you understand where this module fits into the grand scheme of things.
12Go on, do it now. I'll wait.
13
14Ready? Ok...
15
16
17 use MIME::Words qw(:all);
18
19 ### Decode the string into another string, forgetting the charsets:
20 $decoded = decode_mimewords(
21 'To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>',
22 );
23
24 ### Split string into array of decoded [DATA,CHARSET] pairs:
25 @decoded = decode_mimewords(
26 'To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>',
27 );
28
29 ### Encode a single unsafe word:
30 $encoded = encode_mimeword("\xABFran\xE7ois\xBB");
31
32 ### Encode a string, trying to find the unsafe words inside it:
33 $encoded = encode_mimewords("Me and \xABFran\xE7ois\xBB in town");
34
- -
37=head1 DESCRIPTION
38
39Fellow Americans, you probably won't know what the hell this module
40is for. Europeans, Russians, et al, you probably do. C<:-)>.
41
42For example, here's a valid MIME header you might get:
43
44 From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
45 To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
46 CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
47 Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
48 =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
49 =?US-ASCII?Q?.._cool!?=
50
51The fields basically decode to (sorry, I can only approximate the
52Latin characters with 7 bit sequences /o and 'e):
53
54 From: Keith Moore <moore@cs.utk.edu>
55 To: Keld J/orn Simonsen <keld@dkuug.dk>
56 CC: Andr'e Pirard <PIRARD@vm1.ulg.ac.be>
57 Subject: If you can read this you understand the example... cool!
58
59
60=head1 PUBLIC INTERFACE
61
62=over 4
63
64=cut
65
66require 5.001;
67
68### Pragmas:
69use strict;
70use re 'taint';
71use vars qw($VERSION @EXPORT_OK %EXPORT_TAGS @ISA);
72
73### Exporting:
74use Exporter;
75%EXPORT_TAGS = (all => [qw(decode_mimewords
76 encode_mimeword
77 encode_mimewords
78 )]);
79Exporter::export_ok_tags('all');
80
81### Inheritance:
82@ISA = qw(Exporter);
83
84### Other modules:
85use MIME::Base64;
86use MIME::QuotedPrint;
87
- -
90#------------------------------
91#
92# Globals...
93#
94#------------------------------
95
96### The package version, both in 1.23 style *and* usable by MakeMaker:
97$VERSION = "5.509";
98
99### Nonprintables (controls + x7F + 8bit):
100my $NONPRINT = "\\x00-\\x1F\\x7F-\\xFF";
101
102
103#------------------------------
104
105# _decode_Q STRING
106# Private: used by _decode_header() to decode "Q" encoding, which is
107# almost, but not exactly, quoted-printable. :-P
108sub _decode_Q {
109 my $str = shift;
110 local $1;
111 $str =~ s/_/\x20/g; # RFC-1522, Q rule 2
112 $str =~ s/=([\da-fA-F]{2})/pack("C", hex($1))/ge; # RFC-1522, Q rule 1
113 $str;
114}
115
116# _encode_Q STRING
117# Private: used by _encode_header() to decode "Q" encoding, which is
118# almost, but not exactly, quoted-printable. :-P
119sub _encode_Q {
120 my $str = shift;
121 local $1;
122 $str =~ s{([ _\?\=$NONPRINT])}{sprintf("=%02X", ord($1))}eog;
123 $str;
124}
125
126# _decode_B STRING
127# Private: used by _decode_header() to decode "B" encoding.
128sub _decode_B {
129 my $str = shift;
130 decode_base64($str);
131}
132
133# _encode_B STRING
134# Private: used by _decode_header() to decode "B" encoding.
135sub _encode_B {
136 my $str = shift;
137 encode_base64($str, '');
138}
139
- -
142#------------------------------
143
144=item decode_mimewords ENCODED
145
146I<Function.>
147Go through the string looking for RFC 2047-style "Q"
148(quoted-printable, sort of) or "B" (base64) encoding, and decode them.
149
150B<In an array context,> splits the ENCODED string into a list of decoded
151C<[DATA, CHARSET]> pairs, and returns that list. Unencoded
152data are returned in a 1-element array C<[DATA]>, giving an effective
153CHARSET of C<undef>.
154
155 $enc = '=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>';
156 foreach (decode_mimewords($enc)) {
157 print "", ($_->[1] || 'US-ASCII'), ": ", $_->[0], "\n";
158 }
159
160B<In a scalar context,> joins the "data" elements of the above
161list together, and returns that. I<Warning: this is information-lossy,>
162and probably I<not> what you want, but if you know that all charsets
163in the ENCODED string are identical, it might be useful to you.
164(Before you use this, please see L<MIME::WordDecoder/unmime>,
165which is probably what you want.)
166
167In the event of a syntax error, $@ will be set to a description
168of the error, but parsing will continue as best as possible (so as to
169get I<something> back when decoding headers).
170$@ will be false if no error was detected.
171
172Any arguments past the ENCODED string are taken to define a hash of options:
173
174=cut
175
176sub decode_mimewords {
177 my $encstr = shift;
178 my @tokens;
179 local($1,$2,$3);
180 $@ = ''; ### error-return
181
182 ### Collapse boundaries between adjacent encoded words:
183 $encstr =~ s{(\?\=)\s*(\=\?)}{$1$2}gs;
184 pos($encstr) = 0;
185 ### print STDOUT "ENC = [", $encstr, "]\n";
186
187 ### Decode:
188 my ($charset, $encoding, $enc, $dec);
189 while (1) {
190 last if (pos($encstr) >= length($encstr));
191 my $pos = pos($encstr); ### save it
192
193 ### Case 1: are we looking at "=?..?..?="?
194 if ($encstr =~ m{\G # from where we left off..
195 =\?([^?]*) # "=?" + charset +
196 \?([bq]) # "?" + encoding +
197 \?([^?]+) # "?" + data maybe with spcs +
198 \?= # "?="
199 }xgi) {
200 ($charset, $encoding, $enc) = ($1, lc($2), $3);
201 $dec = (($encoding eq 'q') ? _decode_Q($enc) : _decode_B($enc));
202 push @tokens, [$dec, $charset];
203 next;
204 }
205
206 ### Case 2: are we looking at a bad "=?..." prefix?
207 ### We need this to detect problems for case 3, which stops at "=?":
208 pos($encstr) = $pos; # reset the pointer.
209 if ($encstr =~ m{\G=\?}xg) {
210 $@ .= qq|unterminated "=?..?..?=" in "$encstr" (pos $pos)\n|;
211 push @tokens, ['=?'];
212 next;
213 }
214
215 ### Case 3: are we looking at ordinary text?
216 pos($encstr) = $pos; # reset the pointer.
217 if ($encstr =~ m{\G # from where we left off...
218 (.*? # shortest possible string,
219 \n*) # followed by 0 or more NLs,
220 (?=(\Z|=\?)) # terminated by "=?" or EOS
221 }sxg) {
222 length($1) or die "MIME::Words: internal logic err: empty token\n";
223 push @tokens, [$1];
224 next;
225 }
226
227 ### Case 4: bug!
228 die "MIME::Words: unexpected case:\n($encstr) pos $pos\n\t".
229 "Please alert developer.\n";
230 }
231 return (wantarray ? @tokens : join('',map {$_->[0]} @tokens));
232}
233
234#------------------------------
235
236=item encode_mimeword RAW, [ENCODING], [CHARSET]
237
238I<Function.>
239Encode a single RAW "word" that has unsafe characters.
240The "word" will be encoded in its entirety.
241
242 ### Encode "<<Franc,ois>>":
243 $encoded = encode_mimeword("\xABFran\xE7ois\xBB");
244
245You may specify the ENCODING (C<"Q"> or C<"B">), which defaults to C<"Q">.
246You may specify the CHARSET, which defaults to C<iso-8859-1>.
247
248=cut
249
250sub encode_mimeword {
251 my $word = shift;
252 my $encoding = uc(shift || 'Q');
253 my $charset = uc(shift || 'ISO-8859-1');
254 my $encfunc = (($encoding eq 'Q') ? \&_encode_Q : \&_encode_B);
255 "=?$charset?$encoding?" . &$encfunc($word) . "?=";
256}
257
258#------------------------------
259
260=item encode_mimewords RAW, [OPTS]
261
262I<Function.>
263Given a RAW string, try to find and encode all "unsafe" sequences
264of characters:
265
266 ### Encode a string with some unsafe "words":
267 $encoded = encode_mimewords("Me and \xABFran\xE7ois\xBB");
268
269Returns the encoded string.
270Any arguments past the RAW string are taken to define a hash of options:
271
272=over 4
273
274=item Charset
275
276Encode all unsafe stuff with this charset. Default is 'ISO-8859-1',
277a.k.a. "Latin-1".
278
279=item Encoding
280
281The encoding to use, C<"q"> or C<"b">. The default is C<"q">.
282
283=back
284
285B<Warning:> this is a quick-and-dirty solution, intended for character
286sets which overlap ASCII. B<It does not comply with the RFC 2047
287rules regarding the use of encoded words in message headers>.
288You may want to roll your own variant,
289using C<encode_mimeword()>, for your application.
290I<Thanks to Jan Kasprzak for reminding me about this problem.>
291
292=cut
293
294sub encode_mimewords {
295 my ($rawstr, %params) = @_;
296 my $charset = $params{Charset} || 'ISO-8859-1';
297 my $encoding = lc($params{Encoding} || 'q');
298
299 ### Encode any "words" with unsafe characters.
300 ### We limit such words to 18 characters, to guarantee that the
301 ### worst-case encoding give us no more than 54 + ~10 < 75 characters
302 my $word;
303 local $1;
304 $rawstr =~ s{([a-zA-Z0-9\x7F-\xFF]+\s*)}{ ### get next "word"
305 $word = $1;
306 (($word !~ /(?:[$NONPRINT])|(?:^\s+$)/o)
307 ? $word ### no unsafe chars
308 : encode_mimeword($word, $encoding, $charset)); ### has unsafe chars
309 }xeg;
310 $rawstr =~ s/\?==\?/?= =?/g;
311 $rawstr;
312}
313
3141;
315__END__