---
title: Parsing Toki Pona
date: 2019-07-21
---

# Parsing Toki Pona

Language is annoyingly complicated. English in particular is a nightmare. English
is so hard to understand that humans regularly fail to figure out what other
humans are saying in it, even when they are native speakers, who usually have a
bit of an easier time figuring this stuff out.

What if there was a language that had less going on? What if it was simple enough
that we could have a _computer_ tokenize, parse and understand it? This post is
an attempt to show that [Toki Pona](http://tokipona.org) is a potential candidate
for this.

Toki Pona is a constructed/planned language created by the professional translator
Sonja Lang as an attempt to break things down to their core essence. Toki Pona is
tiny (only about 120 words, depending on who you ask) and takes only a few days to
learn and a month or two to master. Because there are so few words, many ideas or
concepts that normally span multiple words in languages like English are
represented in only one Toki Pona word.

- basic grammar
- tokenization
- implementation in Nim
- talk about future parsing into phrases
- structure of phrase
- implementation in Nim

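
As a rough illustration of the tokenization step in the outline above, here is a
minimal sketch. It is written in Python purely as a stand-in and is not the Nim
implementation the outline plans. The names `tokenize`, `split_parts`,
`PART_MARKERS` and the `"subject"` label are all hypothetical, and it only knows
two particles: `li`, which introduces the predicate, and `e`, which introduces
the direct object.

```python
import re

# Assumed subset of particles, chosen for illustration only:
# "li" opens the predicate and "e" opens the direct object.
PART_MARKERS = {"li", "e"}

# A token is either a run of ASCII letters (a word) or one punctuation mark.
TOKEN_RE = re.compile(r"[A-Za-z]+|[.!?,:]")


def tokenize(text):
    """Split text into word and punctuation tokens, in order."""
    return TOKEN_RE.findall(text)


def split_parts(tokens):
    """Group word tokens into (marker, words) parts.

    The first part is labelled "subject" (an assumed label); every later
    part is labelled with the particle that opened it.
    """
    parts = [("subject", [])]
    for token in tokens:
        if not token.isalpha():
            continue  # punctuation is dropped in this sketch
        if token in PART_MARKERS:
            parts.append((token, []))
        else:
            parts[-1][1].append(token)
    return parts


# "jan li moku e kili." roughly means "a person eats fruit."
print(split_parts(tokenize("jan li moku e kili.")))
# → [('subject', ['jan']), ('li', ['moku']), ('e', ['kili'])]
```

This is deliberately incomplete. Sentences whose subject is just `mi` or `sina`
omit `li`, so `mi moku` would end up entirely in the subject part, and other
particles such as `la`, `pi` and `o` are treated as ordinary words.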