README.md in zenlish-0.1.10 vs README.md in zenlish-0.1.11
- old
+ new
@@ -26,16 +26,16 @@
Zenlish should be rich enough to express ideas, facts in a fluid way (vs. contrived, artificial way).
Litmus test: a Zenlish text should be easy to read to a English reading person.
### Zenlish as a library (gem)
Over time, the zenlish gem will contain:
-- A tokenizer (tagging, lemmatizer)
+- A tokenizer (tagging, lemmatizer)[TODO]
- A lexicon [STARTED]
- A context-free grammar [STARTED]
- A parser [STARTED]
-- Feature unification (for number, gender agreement)
-- A simplified ontology
+- Feature unification (for number, gender agreement)[TODO]
+- A simplified ontology[TODO]
### What is the purpose of __Zenlish__ ?
With __Zenlish__ it should be possible for a Ruby application to interact with
users with a language that is close enough to English.
@@ -45,45 +45,63 @@
The project is still in inception. Currently, zenlish is able to parse all
sentences of the first lesson.
The intent is to deliver gem versions in small increments.
-#### Some project metrics (v. 0.1.10)
+#### Some project metrics (v. 0.1.11)
|Metric|Value|
|:-:|:-:|
-| Number of lemmas in lexicon | 61 |
-| [Coverage 100 commonest English words](https://en.wikipedia.org/wiki/Most_common_words_in_English) | 30 |
-| Number of production rules in grammar | 85 |
-| Number of lessons covered | 10 |
-| Number of sentences in spec files | 87 |
+| Number of lemmas in lexicon | 69 |
+| [Coverage 100 commonest English words](https://en.wikipedia.org/wiki/Most_common_words_in_English) | 34 |
+| Number of production rules in grammar | 95 |
+| Number of lessons covered | 11 |
+| Number of sentences in spec files | 98 |
### Roadmap
Here a tentative roadmap:
-#### A) Support vocabulary and sentences from [Learn These Words First](http://learnthesewordsfirst.com/)
+#### A) Ability to parse sentences from [Learn These Words First](http://learnthesewordsfirst.com/)
+*STARTED*. 11% complete
This website advocates the idea of a multi-layered dictionary.
At the core, there are about 300 essential words.
The choice of these words is inspired by the semantic primitives of [NSM
(Natural Semantic Metalanguage)](https://en.wikipedia.org/wiki/Natural_semantic_metalanguage).
The essential words are introduced in twelve lessons. Each lesson put the words
in exemplar sentences and pictures.
-The project sub-goals are:
+The milestone sub-goals are:
- To inject the 300 core words into Zenlish lexicon,
- Zenlish should be able to parse all the example sentences
-- Also Zenlish should determine the semantics (i.e. meaning) of the sentences
-#### B) Capability to read a complete book
+#### B) Associate lexical features to terms in lexicon
+The sub-goals are:
+- To enrich the lexicon entries with lexical and syntactical features.
+- Zenlish should be able to derive the declensions of nouns, conjugation of verbs,
+- Also Zenlish should detect agreement errors
+- Ideally, Zenlish should have a lemmatizer
+
+#### C) Enrich lexicon entries with semantical features and relationships
+The sub-goals are:
+- To enrich the lexicon entries with lexical and syntactical features.
+- Zenlish should be able to derive the declensions of nouns, conjugation of verbs,
+- Also Zenlish should detect agreement errors
+
+#### D) Build a generic ontology and map Zenlish text to it.
+The sub-goals are:
+- To have a simplified ontology that covers the concepts covered in the lesson sentences.
+- Hopefully Zenlish should be answer to queries related to the lesson sentences.
+
+#### E) Capability to parse a complete book
A good candidate book is "The Edge of the Sky" by Roberto Trotta (ISBN 978-0-465-04471-9 : hardcover, ISBN 978-0-465-04490-0 : ebook).
Professor Trotta challenged himself by writing a book on Cosmology with the 1000 most used words. More details [here](http://robertotrotta.com/the-edge-of-the-sky/).
In order to achieve this goal, Zenlish should:
- Incorporate the 1000 words in its lexicon
- Have a grammar that allows the parsing of the sentences in the book.
-#### C) Capability to interpret the meaning of a complete book
+#### F) Capability to interpret the meaning of a complete book
Probably, far-fetched. But it will be nice to launch query to Zenlish to check if
it has some understanding of the text it reads (i.e. has a semantic representation).