From c7eaaa83248d9ad6aee14d898824b5191ff5dc7f Mon Sep 17 00:00:00 2001
From: Christine Dodrill
Date: Mon, 22 Jul 2019 07:39:03 -0400
Subject: [PATCH] blog: first step of parsing toki pona

---
 blog/parsing-toki-pona-2019-07-21.markdown | 31 ++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 blog/parsing-toki-pona-2019-07-21.markdown

diff --git a/blog/parsing-toki-pona-2019-07-21.markdown b/blog/parsing-toki-pona-2019-07-21.markdown
new file mode 100644
index 0000000..bba6f48
--- /dev/null
+++ b/blog/parsing-toki-pona-2019-07-21.markdown
@@ -0,0 +1,31 @@
+---
+title: Parsing Toki Pona
+date: 2019-07-21
+---
+
+# Parsing Toki Pona
+
+Language is annoyingly complicated. English in particular is a nightmare. English
+is so hard to understand that humans regularly fail to figure out what other
+humans are saying in it. Even if they are native speakers of English, which
+usually have a bit of an easier time figuring this stuff out.
+
+What if there was a language that had less going on? What if it was simple enough
+that we could have a _computer_ tokenize, parse and understand it? This post is
+an attempt to show that [Toki Pona](http://tokipona.org) is a potential candidate
+for this.
+
+Toki Pona is a constructed/planned language created by the professional translator
+Sonja Lang as an attempt to try to break things down to their core essence. Toki
+Pona is tiny (only about 120 words depending on who you ask), requiring only a
+few days to learn and a month or two to master. Because there are so few words,
+many ideas or concepts that normally span multiple words in languages like
+English are represented in only one Toki Pona word.
+
+- basic grammar
+- tokenization
+  - implementation in Nim
+  -
+- talk about future parsing into phrases
+  - structure of phrase
+  - implementation in Nim