On 4. Nov 2006, at 10:38, Brett Terpstra wrote:
> If I understand RFC 822 correctly, multi-line headers require specific syntax (CRLF followed by a single LWSP and terminated by a CR). I'll leave the details up to you, and I'm probably overlooking something simple, I just don't quite grasp how you're going to make it simple for the average blogger to add an excerpt which can be easily parsed as separate from the message body if there's nothing to terminate it. I'm just curious.
We wouldn’t follow it strictly, just the idea. For example, we would drop everything about CR.
It boils down to: a header starts with a word, then a colon. If the next line is indented, that indent is removed, and the header continues with that line.
So a simple post:
    Title: foo
    Excerpt: Lorem ipsum dolor sit amet, consectetur adipisicing elit,
      sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

    And here is the actual body.
Put another way, a header ends at the first line that is neither indented nor empty.
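A minimal sketch of that rule in Python, just to make it concrete. The function name `parse_post`, the header regex, joining continuation lines with a newline, and skipping empty lines inside the header section are all my assumptions, not anything decided here:

```python
import re

# Assumed header shape: a word, a colon, then the value on the same line.
HEADER_RE = re.compile(r"^(\w+):[ \t]*(.*)$")

def parse_post(text):
    """Split a post into (headers, body) using indented continuation lines."""
    lines = text.splitlines()
    headers = {}
    current = None            # name of the header being continued, if any
    body_start = len(lines)   # default: no body at all
    for i, line in enumerate(lines):
        if not line.strip():
            continue          # empty lines do not end the header section
        if line[0] in " \t":
            if current is None:
                body_start = i    # indented text before any header: body
                break
            # Continuation: strip the indent and append to the current header.
            headers[current] += "\n" + line.lstrip()
            continue
        match = HEADER_RE.match(line)
        if match is None:
            body_start = i    # first non-indented, non-header line: body
            break
        current = match.group(1)
        headers[current] = match.group(2)
    return headers, "\n".join(lines[body_start:])
```

Note one limitation of this sketch: a body whose first line happens to look like `Word: text` would be taken as another header.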
A problem, though, is that we wouldn’t be able to support Markdown for the excerpt, at least not code blocks and hard-wrapped list items, since both rely on the indent that gets removed.
Maybe we should go for the here-doc syntax. So it would be:
    Title: foo
    Excerpt: <<EXCERPT
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua.
    EXCERPT

    And here is the actual body.
But then the question is whether there should be a way to escape the << used to introduce a token. Though it is probably better to just not allow arbitrary tokens, so that a false positive is highly unlikely.
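A rough sketch of how the here-doc variant could be parsed without arbitrary tokens. I am assuming here that the only accepted token is the header name in upper case (so `Excerpt:` takes `<<EXCERPT`), which is just one way to restrict it. The name `parse_post_heredoc` and the header regex are likewise hypothetical:

```python
import re

# Assumed header shape: a word, a colon, then the value on the same line.
HEADER_RE = re.compile(r"^(\w+):[ \t]*(.*)$")

def parse_post_heredoc(text):
    """Split a post into (headers, body), allowing here-doc header values."""
    lines = text.splitlines()
    headers = {}
    i = 0
    while i < len(lines):
        match = HEADER_RE.match(lines[i])
        if match is None:
            break                      # first non-header line ends the headers
        name, value = match.group(1), match.group(2)
        token = name.upper()           # assumption: the only allowed token
        i += 1
        if value == "<<" + token:
            collected = []
            # Take lines verbatim (indent preserved) up to the terminator.
            while i < len(lines) and lines[i] != token:
                collected.append(lines[i])
                i += 1
            i += 1                     # step past the terminator line
            value = "\n".join(collected)
        headers[name] = value
    # Skip the empty lines separating the headers from the body.
    while i < len(lines) and not lines[i].strip():
        i += 1
    return headers, "\n".join(lines[i:])
```

With this restriction a value such as `Title: <<foo` stays literal, since `<<foo` is not the token for `Title`. An unterminated here-doc swallows the rest of the post in this sketch.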