---
title: Parsing Toki Pona
date: 2019-07-21
---
# Parsing Toki Pona
Language is annoyingly complicated. English in particular is a nightmare. English
is so hard to understand that humans regularly fail to figure out what other
humans are saying in it, even when they are native speakers, who usually have a
bit of an easier time figuring this stuff out.

What if there was a language that had less going on? What if it was simple enough
that we could have a _computer_ tokenize, parse and understand it? This post is
an attempt to show that [Toki Pona](http://tokipona.org) is a potential candidate
for this.

Toki Pona is a constructed/planned language created by the professional translator
Sonja Lang as an attempt to break ideas down to their core essence. Toki Pona is
tiny (only about 120 words, depending on who you ask), requiring only a few days
to learn and a month or two to master. Because there are so few words, many ideas
or concepts that span multiple words in languages like English are represented by
a single Toki Pona word.
- basic grammar
- tokenization
- implementation in Nim
- talk about future parsing into phrases
- structure of phrase
- implementation in Nim
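
As a rough idea of what the tokenization step in the outline above might look
like, here is a minimal sketch in Nim. The `Token` type, the `tokenize` proc and
the `particles` list are all made up for illustration, and the particle list is
an assumed, non-exhaustive set of Toki Pona's structural words. It only splits
on whitespace and peels off trailing punctuation, so treat it as a starting
point rather than a finished tokenizer.

```nim
import strutils

type
  TokenKind = enum
    tkWord         # an ordinary content word
    tkParticle     # a structural particle such as "li" or "e"
    tkPunctuation  # trailing punctuation such as "." or "?"

  Token = object
    kind: TokenKind
    text: string

# Assumption: a small, non-exhaustive list of structural particles.
const particles = ["li", "e", "la", "pi", "o", "en", "anu"]

proc tokenize(sentence: string): seq[Token] =
  ## Splits a sentence on whitespace and separates trailing punctuation
  ## from each word.
  for raw in sentence.splitWhitespace():
    var word = raw
    var trailing = ""
    # Peel punctuation off the end of the word, keeping its original order.
    while word.len > 0 and word[^1] in {'.', '!', '?', ',', ':'}:
      trailing = word[^1] & trailing
      word.setLen(word.len - 1)
    if word.len > 0:
      let kind = if word in particles: tkParticle else: tkWord
      result.add Token(kind: kind, text: word)
    for ch in trailing:
      result.add Token(kind: tkPunctuation, text: $ch)

when isMainModule:
  # "mi moku e kili." means roughly "I eat fruit."
  for tok in tokenize("mi moku e kili."):
    echo tok.kind, " ", tok.text
```

Tagging particles like `e` at this stage is a design choice, not a requirement.
It would let a later phrase parser look for structural words directly instead
of comparing strings again.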