The idea is to sidestep the hard problems of general natural-language interpretation and focus on the ones we CAN solve easily. For example, given an API of, say, 100 functions that have to do with drawing shapes, we should easily be able to map English-text commands like "Draw a circle of radius 5 at 100, 200" to the right call. And I didn't have to study any manual to figure that out.
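To make the "circle" example concrete, here is a minimal sketch in Python (the project itself targets D; Python is used here only for brevity). The `draw_circle` function and the pattern table are hypothetical stand-ins, not part of any real API:

```python
import re

# Hypothetical stand-in for one of the ~100 drawing-API functions.
def draw_circle(radius, x, y):
    return f"circle(r={radius}) at ({x}, {y})"

# Map an English command template to the output call it should produce.
PATTERNS = [
    (re.compile(r"draw a circle of radius (\d+) at (\d+),\s*(\d+)", re.I),
     lambda m: draw_circle(int(m.group(1)), int(m.group(2)), int(m.group(3)))),
]

def interpret(command):
    """Return the result of the first matching command pattern, or None."""
    for pattern, action in PATTERNS:
        m = pattern.search(command)
        if m:
            return action(m)
    return None

print(interpret("Draw a circle of radius 5 at 100, 200"))
```

This is the exact-match end of the spectrum; the fuzzy machinery described below kicks in when no pattern matches verbatim.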
Basically, the architecture will consist of all the parts needed to build the following demo:
(1) User is presented with a text command terminal, with several adjacent panes.
(2) The user can type in questions or commands that will typically be one line.
(3) Fuzzy matching of the input string against thousands of context languages will determine what the output script should be.
(4) If the match is exact (one and only one context language was matched), then the output code is run.
If the match is fuzzy (several candidates), the user is shown the likely candidates ranked by probability and language metrics.
If there is no fuzzy match, we could increase the maximum fuzzy distance of the search, display options to the user for specifying the correct output, etc.
(5) The above might include connecting to a User Software repository and running the user's search on the server.
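The dispatch logic in steps (3) and (4) can be sketched as follows. This is an illustrative Python sketch, not the D implementation; the command table is made up, and `difflib.get_close_matches` stands in for whatever fuzzy matcher the real system uses:

```python
import difflib

# Hypothetical command patterns standing in for the context languages.
COMMANDS = {
    "draw circle": "shapes.circle()",
    "draw square": "shapes.square()",
    "clear canvas": "canvas.clear()",
}

def dispatch(line, cutoff=0.6):
    """Return ('run', code), ('choose', candidates), or ('miss', [])."""
    if line in COMMANDS:                      # exact match: run the output code
        return ("run", COMMANDS[line])
    candidates = difflib.get_close_matches(line, COMMANDS, n=3, cutoff=cutoff)
    if len(candidates) == 1:                  # one fuzzy hit: run it
        return ("run", COMMANDS[candidates[0]])
    if candidates:                            # several hits: ask the user
        return ("choose", candidates)
    # No match: the caller can lower `cutoff` (i.e. widen the maximum
    # fuzzy distance) and retry, or fall back to a repository search.
    return ("miss", [])

print(dispatch("draw circle"))   # exact match
print(dispatch("drw circel"))    # fuzzy match
print(dispatch("quit"))          # no match
```

The `cutoff` parameter plays the role of the "maximum fuzzy distance" from step (4); retrying with a lower cutoff widens the search.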
The method of interpretation should be left open and will probably depend on the domain. But we will provide the means of specifying these languages and mapping input strings to output code. For the purpose of getting a demo running quickly, I've come up with an obvious N-gram search over automata, but other methods exist, like Levenshtein automata: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.16.652 .
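For intuition, the core of an N-gram search can be sketched in a few lines. This is a simplified Python illustration (character bigrams scored with the Dice coefficient), not the automaton-based version described above; the candidate strings are invented:

```python
def ngrams(s, n=2):
    """Set of character n-grams of a lowercased string."""
    s = s.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def dice(a, b, n=2):
    """Dice coefficient over character n-gram sets, in [0, 1]."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))

def rank(query, languages):
    """Rank candidate context languages by n-gram similarity to the query."""
    return sorted(languages, key=lambda lang: dice(query, lang), reverse=True)

print(rank("drw a circel", ["draw a circle", "draw a square", "clear canvas"]))
```

An automaton-based implementation would precompile the candidate set so each query is answered without scanning every language, but the scoring idea is the same.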
Right now I'm working on a D implementation.
Notes on the command-line demo
+ Text is a limiting medium, but our language-handling methods should be applicable to general data.