-#include <mts/generic/mts.h>
-
-/* This module has a long and checkered history. First, it didn't burst
- maildrops correctly because it considered two CTRL-A:s in a row to be
- an inter-message delimiter. It really is four CTRL-A:s followed by a
- newline. Unfortunately, MMDF will convert this delimiter *inside* a
- message to a CTRL-B followed by three CTRL-A:s and a newline. This
- caused the old version of m_getfld() to declare eom prematurely. The
- fix was a lot slower than
-
- c == '\001' && peekc (iob) == '\001'
-
- but it worked, and to increase generality, MBOX style maildrops could
- be parsed as well. Unfortunately the speed issue finally caught up with
- us since this routine is at the very heart of MH.
-
- To speed things up considerably, the routine Eom() was made an auxilary
- function called by the macro eom(). Unless we are bursting a maildrop,
- the eom() macro returns FALSE saying we aren't at the end of the
- message.
-
- The next thing to do is to read the mts.conf file and initialize
- delimiter[] and delimlen accordingly...
-
- After mhl was made a built-in in msh, m_getfld() worked just fine
- (using m_unknown() at startup). Until one day: a message which was
- the result of a bursting was shown. Then, since the burst boundaries
- aren't CTRL-A:s, m_getfld() would blinding plunge on past the boundary.
- Very sad. The solution: introduce m_eomsbr(). This hook gets called
- after the end of each line (since testing for eom involves an fseek()).
- This worked fine, until one day: a message with no body portion arrived.
- Then the
-
- while (eom (c = Getc (iob), iob))
- continue;
-
- loop caused m_getfld() to return FMTERR. So, that logic was changed to
- check for (*eom_action) and act accordingly.
-
- This worked fine, until one day: someone didn't use four CTRL:A's as
- their delimiters. So, the bullet got bit and we read mts.h and
- continue to struggle on. It's not that bad though, since the only time
- the code gets executed is when inc (or msh) calls it, and both of these
- have already called mts_init().
-
- ------------------------
- (Written by Van Jacobson for the mh6 m_getfld, January, 1986):
-
- This routine was accounting for 60% of the cpu time used by most mh
- programs. I spent a bit of time tuning and it now accounts for <10%
- of the time used. Like any heavily tuned routine, it's a bit
- complex and you want to be sure you understand everything that it's
- doing before you start hacking on it. Let me try to emphasize
- that: every line in this atrocity depends on every other line,
- sometimes in subtle ways. You should understand it all, in detail,
- before trying to change any part. If you do change it, test the
- result thoroughly (I use a hand-constructed test file that exercises
- all the ways a header name, header body, header continuation,
- header-body separator, body line and body eom can align themselves
- with respect to a buffer boundary). "Minor" bugs in this routine
- result in garbaged or lost mail.
-
- If you hack on this and slow it down, I, my children and my
- children's children will curse you.
-
- This routine gets used on three different types of files: normal,
- single msg files, "packed" unix or mmdf mailboxs (when used by inc)
- and packed, directoried bulletin board files (when used by msh).
- The biggest impact of different file types is in "eom" testing. The
- code has been carefully organized to test for eom at appropriate
- times and at no other times (since the check is quite expensive).
- I have tried to arrange things so that the eom check need only be
- done on entry to this routine. Since an eom can only occur after a
- newline, this is easy to manage for header fields. For the msg
- body, we try to efficiently search the input buffer to see if
- contains the eom delimiter. If it does, we take up to the
- delimiter, otherwise we take everything in the buffer. (The change
- to the body eom/copy processing produced the most noticeable
- performance difference, particularly for "inc" and "show".)
-
- There are three qualitatively different things this routine busts
- out of a message: field names, field text and msg bodies. Field
- names are typically short (~8 char) and the loop that extracts them
- might terminate on a colon, newline or max width. I considered
- using a Vax "scanc" to locate the end of the field followed by a
- "bcopy" but the routine call overhead on a Vax is too large for this
- to work on short names. If Berkeley ever makes "inline" part of the
- C optimiser (so things like "scanc" turn into inline instructions) a
- change here would be worthwhile.
-
- Field text is typically 60 - 100 characters so there's (barely)
- a win in doing a routine call to something that does a "locc"
- followed by a "bmove". About 30% of the fields have continuations
- (usually the 822 "received:" lines) and each continuation generates
- another routine call. "Inline" would be a big win here, as well.
-
- Messages, as of this writing, seem to come in two flavors: small
- (~1K) and long (>2K). Most messages have 400 - 600 bytes of headers
- so message bodies average at least a few hundred characters.
- Assuming your system uses reasonably sized stdio buffers (1K or
- more), this routine should be able to remove the body in large
- (>500 byte) chunks. The makes the cost of a call to "bcopy"
- small but there is a premium on checking for the eom in packed
- maildrops. The eom pattern is always a simple string so we can
- construct an efficient pattern matcher for it (e.g., a Vax "matchc"
- instruction). Some thought went into recognizing the start of
- an eom that has been split across two buffers.
-
- This routine wants to deal with large chunks of data so, rather
- than "getc" into a local buffer, it uses stdio's buffer. If
- you try to use it on a non-buffered file, you'll get what you
- deserve. This routine "knows" that struct FILEs have a _ptr
- and a _cnt to describe the current state of the buffer and
- it knows that _filbuf ignores the _ptr & _cnt and simply fills
- the buffer. If stdio on your system doesn't work this way, you
- may have to make small changes in this routine.
-
- This routine also "knows" that an EOF indication on a stream is
- "sticky" (i.e., you will keep getting EOF until you reposition the
- stream). If your system doesn't work this way it is broken and you
- should complain to the vendor. As a consequence of the sticky
- EOF, this routine will never return any kind of EOF status when
- there is data in "name" or "buf").
- */
+#include <h/mts.h>
+#include <h/utils.h>
+
+/*
+** This module has a long and checkered history. First, it didn't burst
+** maildrops correctly because it considered two CTRL-A:s in a row to be
+** an inter-message delimiter. It really is four CTRL-A:s followed by a
+** newline. Unfortunately, MMDF will convert this delimiter *inside* a
+** message to a CTRL-B followed by three CTRL-A:s and a newline. This
+** caused the old version of m_getfld() to declare eom prematurely. The
+** fix was a lot slower than
+**
+** c == '\001' && peekc (iob) == '\001'
+**
+** but it worked, and to increase generality, MBOX style maildrops could
+** be parsed as well. Unfortunately the speed issue finally caught up with
+** us since this routine is at the very heart of MH.
+**
+** To speed things up considerably, the routine Eom() was made an auxiliary
+** function called by the macro eom(). Unless we are bursting a maildrop,
+** the eom() macro returns FALSE saying we aren't at the end of the
+** message.
+**
+** The next thing to do is to read the mts.conf file and initialize
+** delimiter[] and delimlen accordingly...
+**
+** After mhl was made a built-in in msh, m_getfld() worked just fine
+** (using m_unknown() at startup). Until one day: a message which was
+** the result of a bursting was shown. Then, since the burst boundaries
+** aren't CTRL-A:s, m_getfld() would blindly plunge on past the boundary.
+** Very sad. The solution: introduce m_eomsbr(). This hook gets called
+** after the end of each line (since testing for eom involves an fseek()).
+** This worked fine, until one day: a message with no body portion arrived.
+** Then the
+**
+** while (eom(c = getc(iob), iob))
+** continue;
+**
+** loop caused m_getfld() to return FMTERR. So, that logic was changed to
+** check for (*eom_action) and act accordingly.
+**
+** [ Note by meillo 2011-10:
+** as msh was removed from mmh, m_eomsbr() became irrelevant. ]
+**
+** This worked fine, until one day: someone didn't use four CTRL-A:s as
+** their delimiters. So, the bullet got bit and we read mts.h and
+** continue to struggle on. It's not that bad though, since the only time
+** the code gets executed is when inc (or msh) calls it, and both of these
+** have already called mts_init().
+**
+** ------------------------
+** (Written by Van Jacobson for the mh6 m_getfld, January, 1986):
+**
+** This routine was accounting for 60% of the cpu time used by most mh
+** programs. I spent a bit of time tuning and it now accounts for <10%
+** of the time used. Like any heavily tuned routine, it's a bit
+** complex and you want to be sure you understand everything that it's
+** doing before you start hacking on it. Let me try to emphasize
+** that: every line in this atrocity depends on every other line,
+** sometimes in subtle ways. You should understand it all, in detail,
+** before trying to change any part. If you do change it, test the
+** result thoroughly (I use a hand-constructed test file that exercises
+** all the ways a header name, header body, header continuation,
+** header-body separator, body line and body eom can align themselves
+** with respect to a buffer boundary). "Minor" bugs in this routine
+** result in garbaged or lost mail.
+**
+** If you hack on this and slow it down, I, my children and my
+** children's children will curse you.
+**
+** This routine gets used on three different types of files: normal,
+** single msg files, "packed" unix or mmdf mailboxes (when used by inc)
+** and packed, directoried bulletin board files (when used by msh).
+** The biggest impact of different file types is in "eom" testing. The
+** code has been carefully organized to test for eom at appropriate
+** times and at no other times (since the check is quite expensive).
+** I have tried to arrange things so that the eom check need only be
+** done on entry to this routine. Since an eom can only occur after a
+** newline, this is easy to manage for header fields. For the msg
+** body, we try to efficiently search the input buffer to see if
+** it contains the eom delimiter. If it does, we take up to the
+** delimiter, otherwise we take everything in the buffer. (The change
+** to the body eom/copy processing produced the most noticeable
+** performance difference, particularly for "inc" and "show".)
+**
+** There are three qualitatively different things this routine busts
+** out of a message: field names, field text and msg bodies. Field
+** names are typically short (~8 char) and the loop that extracts them
+** might terminate on a colon, newline or max width. I considered
+** using a Vax "scanc" to locate the end of the field followed by a
+** "bcopy" but the routine call overhead on a Vax is too large for this
+** to work on short names. If Berkeley ever makes "inline" part of the
+** C optimiser (so things like "scanc" turn into inline instructions) a
+** change here would be worthwhile.
+**
+** Field text is typically 60 - 100 characters so there's (barely)
+** a win in doing a routine call to something that does a "locc"
+** followed by a "bmove". About 30% of the fields have continuations
+** (usually the 822 "received:" lines) and each continuation generates
+** another routine call. "Inline" would be a big win here, as well.
+**
+** Messages, as of this writing, seem to come in two flavors: small
+** (~1K) and long (>2K). Most messages have 400 - 600 bytes of headers
+** so message bodies average at least a few hundred characters.
+** Assuming your system uses reasonably sized stdio buffers (1K or
+** more), this routine should be able to remove the body in large
+** (>500 byte) chunks. This makes the cost of a call to "bcopy"
+** small but there is a premium on checking for the eom in packed
+** maildrops. The eom pattern is always a simple string so we can
+** construct an efficient pattern matcher for it (e.g., a Vax "matchc"
+** instruction). Some thought went into recognizing the start of
+** an eom that has been split across two buffers.
+**
+** This routine wants to deal with large chunks of data so, rather
+** than "getc" into a local buffer, it uses stdio's buffer. If
+** you try to use it on a non-buffered file, you'll get what you
+** deserve. This routine "knows" that struct FILEs have a _ptr
+** and a _cnt to describe the current state of the buffer and
+** it knows that _filbuf ignores the _ptr & _cnt and simply fills
+** the buffer. If stdio on your system doesn't work this way, you
+** may have to make small changes in this routine.
+**
+** This routine also "knows" that an EOF indication on a stream is
+** "sticky" (i.e., you will keep getting EOF until you reposition the
+** stream). If your system doesn't work this way it is broken and you
+** should complain to the vendor. As a consequence of the sticky
+** EOF, this routine will never return any kind of EOF status when
+** there is data in "name" or "buf".
+*/