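Introduction
Many C++ programmers have heard of test-driven development (TDD). However, almost all the material on the topic deals with higher-level languages and leans toward general theory rather than practice. So in this article I will walk through a step-by-step example of test-driven development of a small project in C++: as the title suggests, a simple interpreter of mathematical expressions. Such a project also makes a good code kata, since it takes no more than an hour to complete (as long as you are not writing an article about the code at the same time).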
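Architecture
Although the application architecture emerges gradually when you use TDD, some up-front design is still needed. It can considerably reduce the total time spent on implementation, and it works especially well when there are ready-made examples of similar systems to use as a model. In our case there is a well-established view of how compilers and interpreters are structured, and we can draw on it.
There are many libraries and tools that ease the development of interpreters and compilers, from Boost.Spirit to ANTLR and Bison. You could even launch a command-line interpreter through popen and evaluate expressions through it. But the goal of this article is the step-by-step development of a fairly complex system using TDD, so we will use only the C++ standard library and the test framework built into the IDE.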
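First, let's list what a simple interpreter should be able to do, in order of priority:
- Evaluate mathematical expressions made up of floating-point numbers and the arithmetic operators (- + / *).
- Take operator precedence into account.
- Handle parentheses.
- Unary plus and minus.
- Evaluate several expressions separated by a semicolon (;).
- Built-in constants (pi, e).
- User-defined constants created with the assignment operator (=).
- Built-in functions with a variable number of arguments.
- User-defined functions.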
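In this article only the first three items will be implemented. Conceptually, the project consists of four parts:
- Lexer. Converts the input string into a sequence of tokens.
- Parser. Builds a syntactic representation of the expression from the tokens, in postfix notation. It does this with the shunting-yard algorithm, without recursion or tables.
- Calculator. Evaluates the result of the expression on a stack machine.
- The interpreter itself. Serves as a facade for the parts above.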
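To make the calculator stage more concrete, here is a minimal sketch of a stack machine that evaluates an expression already converted to postfix form. The names PostfixItem, Num, Op and EvaluatePostfix are hypothetical and do not come from the article's code; this is one plausible shape under those assumptions, not the design the article arrives at.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical postfix item: either a number or a binary operator.
struct PostfixItem {
    bool isNumber;
    double number; // meaningful when isNumber is true
    char op;       // one of + - * /, meaningful when isNumber is false
};

inline PostfixItem Num(double value) { return PostfixItem{ true, value, 0 }; }
inline PostfixItem Op(char op) { return PostfixItem{ false, 0.0, op }; }

// Evaluates a postfix sequence: numbers are pushed, and each operator
// pops its two operands and pushes the result.
inline double EvaluatePostfix(const std::vector<PostfixItem> &items) {
    std::vector<double> stack;
    for(const PostfixItem &item : items) {
        if(item.isNumber) {
            stack.push_back(item.number);
            continue;
        }
        if(stack.size() < 2) throw std::logic_error("Not enough operands.");
        const double right = stack.back(); stack.pop_back();
        const double left = stack.back(); stack.pop_back();
        switch(item.op) {
        case '+': stack.push_back(left + right); break;
        case '-': stack.push_back(left - right); break;
        case '*': stack.push_back(left * right); break;
        case '/': stack.push_back(left / right); break;
        default: throw std::logic_error("Unknown operator.");
        }
    }
    if(stack.size() != 1) throw std::logic_error("Malformed expression.");
    return stack.back();
}

// Usage: "1 + 2 * 3" written in postfix form is "1 2 3 * +".
inline void PostfixExample() {
    assert(EvaluatePostfix({ Num(1), Num(2), Num(3), Op('*'), Op('+') }) == 7.0);
}
```

Because operator precedence is already encoded in the order of the postfix items, the evaluator itself needs neither recursion nor a precedence table.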
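Toolkit
The program is written in Visual Studio 2013 with the Visual C++ Compiler Nov 2013 CTP installed. The tests are based on CppUnitTestFramework, the test framework for C++ projects that is built into Visual Studio. Its support for writing unit tests is minimal compared with Boost.Test or CppUTest, but it is well integrated into the development environment. An alternative is Eclipse with the C/C++ Unit plugin installed and Boost.Test, GTest or QtTest configured. In that setup I recommend using clang: it provides several powerful compile-time and run-time analyzers, which, combined with TDD, catch a large share of errors early.
So, let's create a new project of the "Native Unit Test Project" type and make sure everything compiles.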
Lexer
Let's start with the lexer. We will follow the usual TDD Red-Green-Refactor cycle:
- Write a test and watch it fail (Red).
- Make it pass (Green).
- Improve the design (Refactor).
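Let's write the first test and put it in the LexerTests class. I use a technique called the test list: I write down the tests I plan to write next. Ideas for future tests, which often come up while writing the current test and cannot be implemented right away, go into it as well.
- An empty expression yields an empty list of tokens.
I am used to naming tests in BDD style. Each test name begins with the word Should, and the subject is whatever the test class is named after. In other words: Lexer should do A in response to B. This keeps each test focused on one small aspect of behavior and keeps it from growing.
    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize("");
            Assert::IsTrue(tokens.empty());
        }
    };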
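In CppUnitTestFramework, the TEST_CLASS macro generates the class that holds the test methods, and the TEST_METHOD macro creates a test method itself. Note that an instance of the class is created only once, before all of its tests are run. In Boost.Test, for example, a new instance of the class is created before each test. Therefore, code that must run before each test goes into a method declared with the TEST_METHOD_INITIALIZE macro, and code that must run after each test goes into a method declared with TEST_METHOD_CLEANUP. All assertion methods are static and live in the Assert class. There are not many of them, but they cover the basics.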
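Back to the test. It does not merely fail, it does not even compile. Let's create a Tokenize function in the Lexer namespace that takes a string and returns Tokens, an alias for a std::vector of tokens. For now it only throws, so that the test compiles and fails.
    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Placeholder: throws so that the first test fails (Red).
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter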
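The test now compiles and fails. To choose how to make it pass, we will rely on The Transformation Priority Premise (TPP). The idea is that production code evolves through a series of small transformations, each of which makes it slightly more general, and that when there is a choice, a simpler transformation should be preferred over a more complex one.
The transformations, ordered from the simplest to the most complex:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) From nil to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) Replacing a constant with a variable or an argument.
- (statement → statements) Adding more unconditional statements (break, continue, return and the like).
- (unconditional → if) Splitting the execution path with a condition.
- (scalar → array) From a single value to an array.
- (array → container) From an array to a more complex container.
- (statement → recursion) Replacing a statement with recursion.
- (if → while) Replacing a condition with a loop.
- (expression → function) Replacing an expression with a function or an algorithm.
- (variable → assignment) Changing the value of a variable.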
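To show how such transformations drive the code, here is a small sketch on a deliberately unrelated, hypothetical problem: counting spaces in a string. The function CountSpaces and the Step1 to Step3 namespaces are invented for illustration, and each label names only the main transformation applied at that step.

```cpp
#include <cassert>
#include <string>

// Step 1, (nil → constant): the first test expects 0 for an empty string.
namespace Step1 {
inline int CountSpaces(const std::string &) { return 0; }
}

// Step 2, (unconditional → if): a second test expects 1 for " ".
namespace Step2 {
inline int CountSpaces(const std::string &text) {
    if(text.empty()) return 0;
    return 1;
}
}

// Step 3, (if → while): a third test expects 2 for "a b c",
// which a single condition can no longer satisfy.
namespace Step3 {
inline int CountSpaces(const std::string &text) {
    int count = 0;
    std::string::size_type i = 0;
    while(i < text.size()) {
        if(text[i] == ' ') ++count;
        ++i;
    }
    return count;
}
}

// Each version passes the tests that existed when it was written.
inline void TransformationExample() {
    assert(Step1::CountSpaces("") == 0);
    assert(Step2::CountSpaces(" ") == 1);
    assert(Step3::CountSpaces("a b c") == 2);
}
```

Each step does only as much as the newest failing test demands, which is the same rhythm the lexer follows in the rest of this section.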
The simplest transformation is enough to make the first test pass: return an empty list.
    inline Tokens Tokenize(std::string expr) { return{}; }
The test passes.
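Next, the input type changes from std::string to std::wstring, with Unicode in mind. The next test checks that a single plus sign becomes a single operator token.
    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }
AssertRange is a small helper namespace of our own. Its AreEqual function compares an expected list of values with the actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.
AssertRange
    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // Compare the sizes first so that a missing or extra element is reported clearly.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        // Then compare element by element, naming the position of the mismatch.
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange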
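The same comparison can be tried outside Visual Studio. The sketch below is a framework-free variant that returns the position of the first difference instead of raising an assertion. The function name FirstMismatch and its return convention are assumptions made for this example; they are not part of CppUnitTestFramework or of the article's code.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Returns the position of the first difference between the expected values
// and the actual range, or -1 when both have the same length and equal
// elements. If one sequence is a prefix of the other, the length of the
// shorter one is returned.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    const bool sameLength = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameLength ? -1 : position;
}

inline void FirstMismatchExample() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1); // equal ranges
    assert(FirstMismatch({ 1, 9, 3 }, actual) == 1);  // elements differ at index 1
    assert(FirstMismatch({ 1, 2 }, actual) == 2);     // actual has an extra element
}
```

Taking the expected values as an initializer list keeps the call site as short as in the tests above, since the element type is deduced from the braces.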
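Operator is an enumeration whose underlying type is wchar_t, so the value of each enumerator can simply be the character of its operator. For now, Token is just another name for Operator.
    enum class Operator : wchar_t {
        Plus = L'+',
    };
    typedef Operator Token;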
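For Assert to report a failed comparison it must be able to print the compared values, so a ToString function has to be provided for the type.
std::wstring ToString(const Token &)
    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }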
Now everything compiles, and the test fails. To make it pass, we apply the (unconditional → if) transformation.
    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
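The next test feeds the lexer a single digit.
    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
This test does not compile: so far a Token can only be an operator, and now it also has to be able to hold a number. Several designs come to mind: a hierarchy of Token classes queried with dynamic_cast, an interface built on std::function, or a generic holder such as Boost.Any. We will settle for something simpler: a single Token class that knows its own type and stores the matching value.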
Let's start with the type of the token.
    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
A ToString function is needed for TokenType as well, so that Assert can print its values.
Now the same for a number token.
    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
The (constant → scalar) transformation is enough here: the type is stored in a member variable set by the constructor.
    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
The token should also give back the operator it was created with.
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
A conversion operator does the job.
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
And the same for the value of a number token.
    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two fields can share storage in a union. Each conversion operator checks the token type first.
Token
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        // Only the member that matches m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            return L"Unknown token."; // wide literal, since the function returns std::wstring
        }
    }
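The type check in the conversion operators can be exercised without the test framework. The sketch below repeats a trimmed copy of the Token class so that it compiles on its own; the helper ThrowsWhenReadAsNumber is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Trimmed copy of the Token class from the text above.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: reports whether reading the token as a number throws.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

inline void TokenExample() {
    assert(static_cast<double>(Token(1.5)) == 1.5);        // matching type: value is returned
    assert(ThrowsWhenReadAsNumber(Token(Operator::Plus))); // wrong type: logic_error
}
```

The check turns a read of the inactive union member, which would otherwise go unnoticed, into an exception at the point of misuse.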
With Token able to hold a number, we can return to the Should_tokenize_single_digit test and make it pass:
    if(expr[0] >= L'0' && expr[0] <= L'9') {
        return{ static_cast<double>(expr[0] - L'0') };
    }
    return{ static_cast<Operator>(expr[0]) };
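The next test is a floating-point number.
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
The C standard library already has what we need: isdigit to recognize a digit and atof to convert text to a number, and both have counterparts for wchar_t. Using them is the (expression → function) transformation.
    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        // _wtof is the Microsoft-specific wide-character counterpart of atof.
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }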
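Next comes an expression with more than one token.
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
Returning a list straight from each branch no longer works, because the tokens have to be accumulated. As a preparatory step, the tokens are collected in a result variable, which is returned at the end.
    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }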
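Now the transformation itself: (if → while). The lexer keeps scanning until it reaches the end of the string.
    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // continue right after the parsed number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
wcstod is used here instead of _wtof because it reports, through its second argument, where the number ended. That tells us where to continue scanning.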
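The role of the end pointer is easier to see in isolation. The sketch below uses the same loop shape to collect every number in a string while skipping everything else. ExtractNumbers is a hypothetical helper written for this illustration, and the example assumes the default "C" locale, where the decimal separator is a period.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Collects every number found in the text and skips all other characters.
// wcstod stores in `end` the position right after the parsed number,
// which is exactly where scanning has to continue.
inline std::vector<double> ExtractNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(current, &end));
            current = end; // at least one digit was consumed, so the loop advances
        } else {
            ++current;
        }
    }
    return numbers;
}

inline void ExtractNumbersExample() {
    const std::vector<double> numbers = ExtractNumbers(L"12.34+5");
    assert(numbers.size() == 2);
    assert(numbers[0] == 12.34 && numbers[1] == 5.0);
}
```

A function like _wtof returns only the value, so the caller would have to re-scan the digits to find where the number ended.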
The next test checks that spaces are skipped.
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
The (unconditional → if) transformation again: a character is treated as an operator only if it is a plus sign, and anything else is skipped.
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current; // skip spaces and any other unknown character
        }
    }
All tests pass, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.
    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }
Detail::Tokenizer
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }
        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }
        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }
        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }
        void MoveNext() { ++m_current; }

        const wchar_t *m_current; // points into the expression passed to the constructor
        Tokens m_result;
    };

    } // namespace Detail
After the refactoring the code is easier to extend. The last test covers a complete expression with all the operators and parentheses.
    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
The Operator enumeration gets the missing operators and the parentheses.
    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
The IsOperator method of Tokenizer now has to recognize all of them.
    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }
All tests pass. The complete code of the lexer and its tests follows.
Interpreter.h
    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
        }
    }

    // A token is either an operator or a number; m_type says which union member is valid.
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }
        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }
        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {
                return *m_current == static_cast<wchar_t>(o);
            });
        }
        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }
        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp
    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
The code for the article is available on GitHub.
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
Applying (constant → scalar), the constant return value is replaced with a member variable that each constructor sets.

    class Token {
    public:
        Token(Operator) : m_type(TokenType::Operator) {}
        Token(double) : m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
Next, an operator token should give back the operator it was constructed with.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
A conversion operator is enough for that.

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same goes for a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two values are stored in a union. The conversion operators check the token type and throw if the wrong kind of value is requested.

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        // Only the member that matches m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: the function returns std::wstring.
                return L"Unknown token.";
        }
    }
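To see the type check in action, here is a small self-contained sketch. It repeats a condensed copy of the Token class so that it compiles on its own; the helper NumberAccessThrows and the self-check function are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Condensed copy of the Token class from the text.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true if reading the token as a number throws.
inline bool NumberAccessThrows(const Token &token) {
    try {
        double value = token;  // uses operator double()
        (void)value;
    } catch (const std::logic_error &) {
        return true;
    }
    return false;
}

// Hypothetical self-check showing the intended behavior.
inline void TokenSelfCheck() {
    assert(NumberAccessThrows(Token(Operator::Plus)));  // wrong kind of token
    assert(!NumberAccessThrows(Token(1.5)));            // right kind of token
    assert(static_cast<double>(Token(1.5)) == 1.5);
}
```

The check costs one comparison per access. In exchange, reading the inactive union member becomes a reported error rather than a silent wrong value.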
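Now that Token can hold a number, we return to the lexer and the Should_tokenize_single_digit test. A check of the first character is enough to pass it:

    if(expr[0] >= L'0' && expr[0] <= L'9') {
        return { static_cast<double>(expr[0] - L'0') };
    }
    return { static_cast<Operator>(expr[0]) };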
The next test is a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
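The C standard library already has what we need: isdigit to recognize a digit and atof to parse a number, along with counterparts for wchar_t (iswdigit, and the Microsoft-specific _wtof). This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return {};
        if(iswdigit(*current)) return { _wtof(current) };
        return { static_cast<Operator>(*current) };
    }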
Next, an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
As a preparatory step for handling more than one token, the tokens are now collected in a result variable instead of being returned directly.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }
Now the transformation itself: (if → while). The loop runs until the end of the string is reached.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;  // continue right after the number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
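Note that wcstod is used here instead of _wtof. Besides the parsed value, wcstod reports, through its second argument, a pointer to the first character after the number, which is exactly where scanning has to continue.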
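The end-pointer behavior of wcstod can be tried in isolation. The following sketch wraps the call in a small function; ScanResult, ScanNumberAt and ScanNumberAtDemo are hypothetical names used only here, not part of the interpreter.

```cpp
#include <cassert>
#include <cwchar>

// Result of scanning one number (hypothetical type for this sketch).
struct ScanResult {
    double value;         // the parsed number
    const wchar_t *rest;  // first character after the number
};

// Parses a number at the start of `text` and reports where parsing stopped.
inline ScanResult ScanNumberAt(const wchar_t *text) {
    wchar_t *end = nullptr;
    double value = std::wcstod(text, &end);
    return { value, end };
}

// Small self-check showing the intended use.
inline void ScanNumberAtDemo() {
    ScanResult r = ScanNumberAt(L"12.5+3");
    assert(r.value == 12.5);  // 12.5 is exactly representable as a double
    assert(*r.rest == L'+');  // parsing stopped at the operator
}
```

If no number can be parsed at all, wcstod returns zero and sets the end pointer to the start of the input. The lexer never hits this case because it calls wcstod only after iswdigit has confirmed a digit.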
Next, spaces in the expression should be ignored.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
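One more (unconditional → if): a character is treated as an operator only if it is a plus sign, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;  // skip anything that is neither a number nor an operator
        }
    }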
The function has grown enough to deserve refactoring. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;  // points into the expression being scanned
        Tokens m_result;
    };

    } // namespace Detail
The last lexer test is a more complex expression that uses all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
The Operator enumeration is extended with the new values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
Only the IsOperator method of the Tokenizer class has to change.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        // True if the current character matches any known operator.
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }
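As an aside, the same membership check can be written against a string of operator characters. This is only an alternative sketch, not the approach taken in the article: IsOperatorChar and IsOperatorCharDemo are hypothetical names, and the string literal has to be kept in sync with the Operator enumeration by hand.

```cpp
#include <cassert>
#include <string>

// Hypothetical alternative: true if `c` is one of the operator characters.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}

// Small self-check showing the intended behavior.
inline void IsOperatorCharDemo() {
    assert(IsOperatorChar(L'+'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
}
```

The any_of version in the text has the advantage of being written in terms of the enumerators themselves, so a renamed or removed operator is caught at compile time.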
The complete code at this point.

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return { static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    // A token is either an operator or a number; m_type tells which union member is valid.
    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                         Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {
                return *m_current == static_cast<wchar_t>(o);
            });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position "
                + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
The code for this article is available on GitHub.
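For the first test to compile, we need a Token type, a list of tokens (Tokens, an alias for std::vector of Token), and a Tokenize function. For now the function simply throws, so the test fails.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer
    } // namespace Interpreter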
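Now the test has to be made to pass with as small a change as possible. A useful guide here is The Transformation Priority Premise (TPP), proposed by Robert C. Martin. Refactorings change the structure of code without changing its behavior. Transformations are their counterpart: small changes that do alter behavior, each making the code slightly more general. The premise is that transformations can be ordered from simple to complex, and that the simpler transformation should be preferred when making a test pass. In the same way, the next test should be chosen so that a simple transformation is enough to pass it.

The transformations, from highest priority to lowest:
- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): nil is replaced with a constant.
- (constant → constant+): a simple constant is replaced with a more complex one.
- (constant → scalar): a constant is replaced with a variable or an argument.
- (statement → statements): more unconditional statements are added (break, continue, return and the like).
- (unconditional → if): the execution path is split.
- (scalar → array): a variable is replaced with an array.
- (array → container): an array is replaced with a container.
- (statement → recursion): a statement is replaced with a recursive call.
- (if → while): a condition is replaced with a loop.
- (expression → function): an expression is replaced with a function or an algorithm.
- (variable → assignment): the value of a variable is changed.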
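To make the ordering more concrete, here is a toy progression that is unrelated to the lexer. Each version is the smallest change that passes one more test. SumDigitsV1 to SumDigitsV3 and the self-check are hypothetical names invented for this sketch, and the input is assumed to contain only the characters '0' to '9'.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Test 1: an empty string gives 0.
// (nil → constant): a constant is all that is needed.
inline int SumDigitsV1(const std::string &) {
    return 0;
}

// Test 2: a single digit gives its value.
// (unconditional → if) splits the path; (constant → scalar) uses the argument.
inline int SumDigitsV2(const std::string &digits) {
    if (digits.empty()) return 0;
    return digits[0] - '0';
}

// Test 3: several digits give their sum.
// (if → while): the condition becomes a loop.
inline int SumDigitsV3(const std::string &digits) {
    int sum = 0;
    std::size_t i = 0;
    while (i < digits.size()) {
        sum += digits[i] - '0';
        ++i;
    }
    return sum;
}

// Each version passes the test it was written for.
inline void SumDigitsSelfCheck() {
    assert(SumDigitsV1("") == 0);
    assert(SumDigitsV2("7") == 7);
    assert(SumDigitsV3("123") == 6);
}
```

The later versions still pass the earlier tests, which is the point: each transformation generalizes the code without breaking what already works.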
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
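The first test needs only enough code to compile and fail: an empty Token type, a token list (a std::vector aliased as Tokens), and a Tokenize stub that throws.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter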
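Getting from red to green in small steps is easier with a guideline for choosing the next change. The Transformation Priority Premise (TPP) provides one: code transformations are ranked from simple to complex, and the simplest transformation that makes the failing test pass is preferred. TPP is a premise, not a strict rule.
The transformations, from simplest to most complex:
- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): from nil to a constant.
- (constant → constant+): from a simple constant to a more complex one.
- (constant → scalar): from a constant to a variable or an argument.
- (statement → statements): adding unconditional statements (break, continue, return and so on).
- (unconditional → if): splitting the execution path.
- (scalar → array): from a variable to an array.
- (array → container): from an array to a container.
- (statement → recursion): from a statement to recursion.
- (if → while): from a condition to a loop.
- (expression → function): replacing an expression with a function or an algorithm.
- (variable → assignment): changing the value of a variable.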
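As a rough illustration of how the transformations chain together, the sketch below grows a hypothetical Length function through three of them. It is not part of the interpreter; the names Step1, Step2, Step3 and Length are assumptions made only for this example, and each namespace holds one intermediate version of the same function.

```cpp
#include <cstddef>

// Hypothetical example: each namespace is one stage of the same function.

namespace Step1 {
// (nil → constant): the simplest code that passes a test for the empty string.
inline std::size_t Length(const wchar_t *) { return 0; }
} // namespace Step1

namespace Step2 {
// (unconditional → if): a test for a one-character string splits the path.
inline std::size_t Length(const wchar_t *text) {
    if(*text == L'\0') return 0;
    return 1;
}
} // namespace Step2

namespace Step3 {
// (if → while): a test for a longer string turns the branch into a loop.
inline std::size_t Length(const wchar_t *text) {
    std::size_t count = 0;
    while(*text != L'\0') {
        ++count;
        ++text;
    }
    return count;
}
} // namespace Step3
```

Each version is only as general as the tests written so far require, which is the same rhythm the lexer follows.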
The simplest change makes the test pass: the stub returns an empty list.
inline Tokens Tokenize(std::string expr) { return{}; }
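The test passes, and there is nothing to refactor yet. Before the next test, std::string is replaced with std::wstring so that the lexer works with Unicode text. The next test is a single operator.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}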
AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare the lengths first so that a missing or extra token is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
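For readers without the Visual Studio test framework, the same element-by-element comparison can be sketched in standard C++ alone. FirstMismatch is a hypothetical name introduced for this example; instead of failing an assertion it returns the position of the first difference.

```cpp
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns -1 when both sequences match, otherwise the index
// of the first position where they differ. When one sequence is a prefix of the
// other, that index equals the length of the shorter one.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    const bool bothFinished =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return bothFinished ? -1 : position;
}
```

Taking the expected values as an initializer list keeps the call site short, which is the same design choice AssertRange::AreEqual makes.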
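Operator is an enum class whose underlying type is wchar_t, so the value of each enumerator is the character that denotes it. For now Token is simply an alias for Operator.
enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;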
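For Assert to report a failed comparison, a ToString function must exist for the compared type, so one is added for Token.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}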
Now the test compiles and fails. The (unconditional → if) transformation makes it pass.
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test is for a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
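This test forces a design decision: a token must now hold either an operator or a number. The options considered involve a Token class hierarchy queried with dynamic_cast, std::function, and Boost.Any. A single Token class that remembers its own type is used here instead, and it gets its own tests.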
The first Token test checks the type of an operator token.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
ToString for TokenType is needed as well and follows the same pattern. The next test covers number tokens.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The (constant → scalar) transformation replaces the hard-coded type with a field.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, the operator itself must be retrievable from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The token stores the operator and returns it through a conversion operator.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same is needed for numbers.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token never holds an operator and a number at the same time, so the two values share storage in an anonymous union. Each conversion checks the stored type and throws if it does not match.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // The function returns std::wstring, so the literal must be wide.
            return L"Unknown token.";
    }
}
Token is ready, so back to the lexer. The single-digit test passes with a range check inside Tokenize:
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
The test passes. The next one is a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
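The C library already provides isdigit and atof, and both have wchar_t counterparts: iswdigit and _wtof (the latter is specific to the Microsoft C runtime). The (expression → function) transformation replaces the hand-written check with them.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}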
The next test combines an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
Returning a single token is no longer enough. As a preparatory step, the tokens are accumulated in a result variable.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
The next transformation is (if → while): the check for the end of the string becomes a loop over the whole expression.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
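wcstod replaces _wtof because, besides the value, it reports where the number ends, so scanning can continue from that position.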
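The behaviour of the end pointer can be seen in isolation in the sketch below. NumberPrefix and ParseNumberPrefix are hypothetical names introduced for this example; only std::wcstod comes from the standard library.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical result type: the parsed value and how many characters it occupied.
struct NumberPrefix {
    double value;
    std::size_t length;
};

// Parses a number at the start of text. When no number can be parsed,
// wcstod leaves the end pointer equal to text, so length is 0.
inline NumberPrefix ParseNumberPrefix(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double parsed = std::wcstod(text, &end);
    return NumberPrefix{ parsed, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.34+5" the reported length is 5, which is exactly where the lexer should resume and find the plus sign. Note that wcstod also accepts a leading sign and leading whitespace, which is why the lexer calls it only after iswdigit has confirmed that the current character is a digit.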
The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
One more (unconditional → if): only a plus is treated as an operator, and any other character is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}
The function has grown, so it is time to refactor. The scanning logic moves to a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
What remains is the other operators and the parentheses, covered by one larger test.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
Operator gains the remaining operators and both parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
IsOperator in Tokenizer now checks the current character against all of them.
bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
That completes the lexer. The full code follows.
Interpreter.h
#pragma once
#include <algorithm>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    // Only one of the two values is meaningful, as indicated by m_type.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
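To experiment with the same scanning loop outside Visual Studio, a condensed variant can be written with the standard library only. This is a sketch under assumptions, not the article's code: the Sketch namespace, the Kind enum, and the plain Token struct are hypothetical simplifications, and wcschr over a fixed character set stands in for the Operator enum.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

namespace Sketch {

// Hypothetical simplified token: a tag plus both possible payloads.
enum class Kind { Number, Operator };

struct Token {
    Kind kind;
    double number;   // meaningful when kind == Kind::Number
    wchar_t symbol;  // meaningful when kind == Kind::Operator
};

inline std::vector<Token> Tokenize(const std::wstring &expr) {
    std::vector<Token> tokens;
    const wchar_t *current = expr.c_str();
    while(*current != L'\0') {
        if(std::iswdigit(*current)) {
            // wcstod reports where the number ends through the end pointer.
            wchar_t *end = nullptr;
            const double value = std::wcstod(current, &end);
            tokens.push_back({ Kind::Number, value, L'\0' });
            current = end;
        } else if(std::wcschr(L"+-*/()", *current) != nullptr) {
            tokens.push_back({ Kind::Operator, 0.0, *current });
            ++current;
        } else {
            // Anything else, including spaces, is skipped.
            ++current;
        }
    }
    return tokens;
}

} // namespace Sketch
```

Calling Sketch::Tokenize(L"1+2*3/(4-5)") yields eleven tokens, matching the expectation in Should_tokenize_complex_experssion. The article's Token class is the stricter design, since its conversions throw when the wrong payload is requested.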
The source code is available on GitHub.
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
The test now compiles but fails. To make it pass, apply (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}

The next test covers a number that consists of a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
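This test does not compile: a token must now hold either an operator or a number. There are several ways to represent that, among them:
- a Token class hierarchy, with dynamic_cast to reach the concrete type;
- std::function;
- Boost.Any or a similar type-erasing wrapper.
Here we take a simpler route and give Token an explicit type tag. Token gets its own test class, and the first test checks the type of an operator token.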
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};

…

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};

Assert also needs a ToString overload for TokenType. The next test covers number tokens.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}

To make it pass, apply (constant → scalar): the type becomes a field that each constructor sets.

class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, an operator token should give back its operator.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}

A conversion operator is enough.

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};

The same goes for the value of a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in an anonymous union. The conversion operators check the type tag and throw if the token is of the other kind.

Token

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
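For comparison, a compiler with C++17 support (newer than the one used in this article) offers std::variant, which packages the tag and the union into one type. The sketch below is an alternative technique, not the article's Token; the names Op, VariantToken, IsNumber and NumberOf are hypothetical.

```cpp
#include <cassert>
#include <variant>

enum class Op : wchar_t {
    Plus = L'+',
    Minus = L'-',
};

// Holds exactly one of the two alternatives and remembers which one.
using VariantToken = std::variant<Op, double>;

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}

// Throws std::bad_variant_access when the token holds an operator.
inline double NumberOf(const VariantToken &token) {
    return std::get<double>(token);
}
```

The checked access plays the same role as the std::logic_error thrown by the conversion operators of the hand-written class.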
Back to the lexer. With Token in place, the single-digit test can be made to pass:

if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };

The test passes. The next one covers a number with several digits and a fractional part.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
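The C standard library already provides isdigit and atof, along with wide-character counterparts for wchar_t. Using them is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    // _wtof is the Microsoft-specific wide-character counterpart of atof.
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}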
The test passes. Next comes an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}

Returning from each branch no longer works, because the function now has to return several tokens. As a preparatory step, collect the tokens in a local variable, result, without changing the behavior.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}

Now the key transformation: (if → while). The loop runs until the end of the string.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;  // continue right after the number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
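wcstod replaces _wtof because, through its second argument, it also reports where the parsed number ended, so the scan can continue from that position.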
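The role of the end pointer is easier to see in isolation. The sketch below uses the same digit check and wcstod call as the lexer, but collects only the numbers and skips everything else; the function name ScanNumbers is hypothetical.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Returns every number found in text; all other characters are skipped.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(current, &end));
            current = end;  // wcstod reports the first character it did not consume
        } else {
            ++current;
        }
    }
    return numbers;
}
```

Because the call only happens when the current character is a digit, wcstod always consumes at least one character and the loop is guaranteed to advance.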
The next test: spaces should be skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}

Apply (unconditional → if) once more: a character becomes an operator token only if it really is an operator, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;  // skip spaces and any other unknown character
    }
}
Time to refactor. The scanning logic moves into a class in the Detail namespace, and Tokenize shrinks to a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    // Keeps a pointer into expr, so expr must outlive the tokenizer.
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
With the structure in place, one larger test covers the remaining operators and the parentheses.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}

Operator gets the missing values.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer now checks the current character against all of them.

// Requires <algorithm> for std::any_of.
bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
        Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
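Since every operator is a single character, the same membership test can also be written as a search in a string of operator characters. The sketch below is a hypothetical alternative, not the article's implementation: the free function IsOperatorChar and the hard-coded character list are assumptions made for the example.

```cpp
#include <cassert>
#include <string>

// True if ch is one of the operator or parenthesis characters.
inline bool IsOperatorChar(wchar_t ch) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(ch) != std::wstring::npos;
}
```

The trade-off is that the character list has to be kept in sync with the Operator enumeration by hand, whereas the std::any_of version refers to the enumerators directly.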
All tests pass. The complete code at this point:

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
            Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
The Operator enumeration gets the missing operators and both parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer now has to recognize all of them.
bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
        Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
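As an alternative to listing the enumerators, the same check can be sketched as a lookup in a string of operator characters. This is a variant under assumptions, not the article's implementation: IsOperatorChar and CheckIsOperatorChar are hypothetical names, and the character set is copied by hand from the Operator enumeration, so the two would have to be kept in sync.

```cpp
#include <cassert>
#include <string>

// Hypothetical alternative: all operator characters live in one string and a
// character is an operator if it can be found there.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}

// Usage example: operators are recognized, digits and spaces are not.
inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'+'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'7'));
    assert(!IsOperatorChar(L' '));
}
```

The lookup trades type safety for brevity: std::any_of over the enumerators keeps the enumeration as the single source of truth, while the string has to be edited whenever an operator is added.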
The complete code of the lexer and its tests at this point:
Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
            Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests
The code for the article is available on GitHub.
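The token list is a std::vector, aliased as Tokens. The first version of the header only has to let the test compile and fail, so Tokenize simply throws.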
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter
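The test compiles and fails, and the next step is to make it pass with the smallest possible change. A useful guide here is The Transformation Priority Premise (TPP), formulated by Robert C. Martin. Refactorings change the structure of code without changing its behavior. Transformations are their counterpart: they change behavior, moving the code from the specific to the more generic. The transformations are ordered by complexity, and the premise is that a test should be passed with the simplest transformation that works, and that the next test should be chosen so that it needs the simplest transformation available. Following TPP tends to keep each step small and the resulting algorithm simple.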
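The transformations, from simplest to most complex:
- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): nil is replaced with a constant.
- (constant → constant+): a simple constant is replaced with a more complex one.
- (constant → scalar): a constant is replaced with a variable or an argument.
- (statement → statements): more unconditional statements are added (break, continue, return, and the like).
- (unconditional → if): the execution path is split.
- (scalar → array): a variable is replaced with an array.
- (array → container): an array is replaced with a more complex container.
- (statement → recursion): a statement is replaced with recursion.
- (if → while): a condition is replaced with a loop.
- (expression → function): an expression is replaced with a function or an algorithm.
- (variable → assignment): the value of a variable is changed.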
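How a few of these transformations follow one another can be sketched on a toy function. The example is hypothetical and unrelated to the lexer: CountDigitsV1 to CountDigitsV3 and CheckCountDigits are names invented here, and each version is roughly what the next failing test would force.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// (nil → constant): enough for a test with an empty string.
inline int CountDigitsV1(const std::wstring &) { return 0; }

// (unconditional → if): enough for a test with a one-character string.
inline int CountDigitsV2(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}

// (if → while): the single check becomes a loop over the whole string.
inline int CountDigitsV3(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(std::iswdigit(text[i])) ++count;
        ++i;
    }
    return count;
}

// Usage example: each version satisfies the test it was written for.
inline void CheckCountDigits() {
    assert(CountDigitsV1(L"") == 0);
    assert(CountDigitsV2(L"7") == 1);
    assert(CountDigitsV3(L"a1b22") == 3);
}
```

Each step uses a transformation further down the list only when the test at hand cannot be passed with a simpler one.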
The first test passes with the simplest transformation: Tokenize returns an empty list.
inline Tokens Tokenize(std::string expr) { return{}; }
From this point std::string gives way to std::wstring, so the lexer works with Unicode text. The next test is for a single operator.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
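AssertRange is a helper namespace for the tests. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of a mismatch.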
AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
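The same comparison can be sketched without the Visual Studio test framework. The version below is an illustration under assumptions, not the article's helper: FirstMismatch and CheckFirstMismatch are hypothetical names, and instead of asserting, the function returns the index of the first differing position, or -1 when the sequences are equal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical helper. ActualRange is assumed to provide size() and begin().
// A length difference counts as a mismatch at the end of the shorter sequence.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    const std::size_t expectSize = expect.size();
    const std::size_t actualSize = actual.size();
    const std::size_t common = std::min(expectSize, actualSize);
    // Compare only the common prefix, so std::mismatch never reads past the
    // end of the shorter sequence.
    auto result = std::mismatch(expect.begin(), expect.begin() + common, actual.begin());
    if(result.first != expect.begin() + common) return result.first - expect.begin();
    if(expectSize != actualSize) return static_cast<std::ptrdiff_t>(common);
    return -1;
}

// Usage example with a vector of int as the actual range.
inline void CheckFirstMismatch() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1);
    assert(FirstMismatch({ 1, 9, 3 }, actual) == 1);
    assert(FirstMismatch({ 1, 2 }, actual) == 2);
}
```

Returning a position rather than asserting keeps the helper independent of any test framework; a framework-specific wrapper can turn the result into a failure message.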
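Operator is an enumeration with wchar_t as its underlying type. The value of each enumerator is the operator's character, so a character converts to an Operator with a plain cast.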
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;
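For the test to compile, Assert also needs a ToString function for the compared type, which it uses to print values when a comparison fails.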
std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
The test now compiles but fails. It passes after applying (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
Next comes a test for a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
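A token now has to represent either an operator or a number, so the Token typedef is no longer enough. There are several ways to model this:
- A Token base class with one subclass per kind of token, told apart with dynamic_cast.
- An approach based on std::function.
- A type-erased holder such as Boost.Any, which would pull in a library outside the standard one.
Here the token becomes a single class that stores its own type. It is developed test by test, starting with the type of an operator token.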
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
A ToString function is needed for TokenType as well. The next test covers a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
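Type() can no longer return a constant. Applying (constant → scalar), the type is stored in a member that each constructor sets.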
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
The next test reads the operator back from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
A stored operator and a conversion operator make it pass.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same test for a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in a union.
Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token.";
    }
}
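The tag-plus-union pattern can be seen in isolation in the miniature below. It is a sketch under assumptions, not the article's Token class: NumberOrSymbol, NumberAccessThrows and CheckNumberOrSymbol are hypothetical names, and a plain bool stands in for the TokenType tag.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical miniature: a tag records which union member is active, and
// each accessor checks the tag before reading.
class NumberOrSymbol {
public:
    explicit NumberOrSymbol(double number) : m_isNumber(true), m_number(number) {}
    explicit NumberOrSymbol(wchar_t symbol) : m_isNumber(false), m_symbol(symbol) {}

    bool IsNumber() const { return m_isNumber; }

    double Number() const {
        if(!m_isNumber) throw std::logic_error("Not a number.");
        return m_number;
    }

    wchar_t Symbol() const {
        if(m_isNumber) throw std::logic_error("Not a symbol.");
        return m_symbol;
    }

private:
    bool m_isNumber;
    union {              // anonymous union: both members share the same storage
        double m_number;
        wchar_t m_symbol;
    };
};

// Returns true if reading the value as a number is rejected.
inline bool NumberAccessThrows(const NumberOrSymbol &value) {
    try { value.Number(); }
    catch(const std::logic_error &) { return true; }
    return false;
}

// Usage example: the matching accessor works, the other one throws.
inline void CheckNumberOrSymbol() {
    assert(NumberOrSymbol(1.5).Number() == 1.5);
    assert(NumberOrSymbol(L'+').Symbol() == L'+');
    assert(NumberAccessThrows(NumberOrSymbol(L'+')));
}
```

Checking the tag in every accessor is what makes the union safe to use: reading the inactive member would otherwise be undefined behavior rather than a reported error.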
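With the Token class in place, it is time to return to the lexer. The single-digit test passes once Tokenize tells a digit from an operator: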
if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };
Next, a number with several digits and a fractional part.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
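The C library already has what is needed: isdigit and atof or, for wchar_t, iswdigit and _wtof (the latter is a Microsoft extension). Replacing the hand-written conversion with these calls is the (expression → function) transformation.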
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
Next, an operator followed by a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
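Before making it pass, a small refactoring prepares the ground: the tokens are collected in a local variable, result, which is returned at the end.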
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
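The next transformation is (if → while): the single check becomes a loop that scans the whole expression, token by token.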
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
The code for this article is available on GitHub.
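The test does not compile yet, so we write the function under test. A token list is simply a std::vector, which we name Tokens. For now the function only throws, so the test fails.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {
        inline Tokens Tokenize(std::string expr) { throw std::exception(); }
    } // namespace Lexer

    } // namespace Interpreter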
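The test is red, and we need the smallest change that makes it green. Robert Martin's The Transformation Priority Premise (TPP) helps here. Transformations are the counterpart of refactorings: a refactoring changes the structure of the code without changing its behavior, while a transformation changes the behavior. Transformations are ordered from simple to complex, and the premise is that when a test can be passed in several ways, the simpler transformation should be preferred. Tests should likewise be chosen so that the simplest transformations are enough to pass them. Following the TPP keeps every step small.
The transformations, from the simplest to the most complex:
- ({} → nil): from no code at all to code that uses nil.
- (nil → constant): from nil to a constant.
- (constant → constant+): from a simple constant to a more complex one.
- (constant → scalar): replace a constant with a variable or an argument.
- (statement → statements): add more unconditional statements (break, continue, return and so on).
- (unconditional → if): split the execution path.
- (scalar → array): from a variable or argument to an array.
- (array → container): from an array to a more complex container.
- (statement → recursion): from a statement to recursion.
- (if → while): from a condition to a loop.
- (expression → function): replace an expression with a function or an algorithm.
- (variable → assignment): replace the value of a variable.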
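To make the ordering more tangible, here is a small sketch that is not part of the interpreter. It grows a hypothetical digit counter (the names CountDigitsV1 to CountDigitsV3 are invented for this illustration), applying one transformation per new test. The mapping of each step to a transformation is our reading of the list above.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Test 1: "" gives 0. Transformation (nil -> constant): return a constant.
inline int CountDigitsV1(const std::wstring &) { return 0; }

// Test 2: "7" gives 1. Transformation (unconditional -> if): split the path.
inline int CountDigitsV2(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}

// Test 3: "a1b22" gives 3. Transformation (if -> while): repeat the check
// for every character and accumulate the result.
inline int CountDigitsV3(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(std::iswdigit(text[i])) ++count;
        ++i;
    }
    return count;
}

// Each version passes exactly the test that motivated it.
inline void CheckCountDigits() {
    assert(CountDigitsV1(L"") == 0);
    assert(CountDigitsV2(L"7") == 1);
    assert(CountDigitsV3(L"a1b22") == 3);
}
```

Each version is the least code that satisfies the tests written so far, which is the discipline the lexer below follows as well.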
The simplest transformation is enough to pass the first test: we return an empty list.

    inline Tokens Tokenize(std::string expr) { return{}; }
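The test passes. Before moving on, we replace std::string with std::wstring, so that the lexer works with wide (Unicode) strings. The next test checks a single operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }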
AssertRange is a helper namespace of our own. Its AreEqual function compares an expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of a mismatch.

AssertRange

    namespace AssertRange {
        template<class T, class ActualRange>
        static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
            auto actualIter = begin(actual);
            auto expectIter = begin(expect);
            Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
            for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
                auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
                Assert::AreEqual(*expectIter, *actualIter, message.c_str());
            }
        }
    } // namespace AssertRange
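Operator is an enumeration whose underlying type is wchar_t, so the value of each operator is simply its character. For now, Token is just another name for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };
    typedef Operator Token;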
To report a failed comparison, Assert has to print the compared values, so it needs a ToString function for their type.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }
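Why an assertion needs ToString can be shown without the Visual Studio framework. The sketch below is an assumption-laden stand-in, not CppUnitTestFramework code: the MiniAssert namespace, the Describe function and the message format are invented for illustration. It relies only on the fact that an unqualified call inside a template finds ToString through argument-dependent lookup in the namespace of the compared type.

```cpp
#include <cassert>
#include <string>

namespace Demo {
    enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

    // Lives next to Operator, so argument-dependent lookup can find it.
    inline std::wstring ToString(const Operator &op) {
        return std::wstring(1, static_cast<wchar_t>(op));
    }
} // namespace Demo

namespace MiniAssert {
    // Hypothetical failure-message builder: it cannot print an arbitrary T
    // by itself, so it delegates to whatever ToString exists for T.
    template<class T>
    std::wstring Describe(const T &expected, const T &actual) {
        return L"Expected:<" + ToString(expected) + L"> Actual:<" + ToString(actual) + L">";
    }
} // namespace MiniAssert

inline void CheckDescribe() {
    using Demo::Operator;
    assert(MiniAssert::Describe(Operator::Plus, Operator::Minus) == L"Expected:<+> Actual:<->");
}
```

If ToString were missing for the type, the template would fail to compile at the point of use, which is roughly the situation the lexer test runs into at this step.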
Now the test compiles and fails, as it should. To make it pass, we apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
The next test feeds the lexer a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
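This test does not compile: so far a token is an operator, and now it must also be able to hold a number. There are several ways to get there:
- Make Token a base class with a subclass for each kind of token, and recover the concrete type with dynamic_cast.
- Build the token on std::function.
- Use a ready-made container for values of any type, such as Boost.Any. That would pull in a third-party library.
- Keep a single Token class that stores the kind of the token together with its value.
We take the last route and grow the class one small test at a time.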
We start with the type of the token.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
Assert again needs a ToString function, this time for TokenType. Once it is added, we write the same test for a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
The test fails. We apply (constant → scalar): the constant becomes a member variable.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
Next, a token has to give back the operator it was created with.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

We store the operator and add a conversion operator.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same test for a number.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }

A token is never an operator and a number at the same time, so both values can share storage in a union. Each conversion first checks the stored type.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }
    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }
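The checks inside the conversion operators are what keeps the union safe to use. The following self-contained sketch restates the class in a reduced form (a single operator, no ToString) and adds a hypothetical helper, ReadingAsNumberThrows, to show what happens when a token is read as the wrong kind.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;   // tag that says which union member is valid
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a number is rejected.
inline bool ReadingAsNumberThrows(const Token &token) {
    try {
        const double value = token;
        (void)value;
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

inline void CheckTokenConversions() {
    assert(ReadingAsNumberThrows(Token(Operator::Plus)));
    assert(!ReadingAsNumberThrows(Token(1.5)));
    assert(static_cast<Operator>(Token(Operator::Plus)) == Operator::Plus);
}
```

Without the tag check, reading m_number from a token built from an operator would read a union member that was never written.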
The Token class is ready, and we can return to the lexer test for a single digit. The most direct solution:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };
The test passes. Next comes a floating point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
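The C library already has what we need: isdigit and atof, together with their wchar_t counterparts (the _wtof used here is specific to the Microsoft C runtime). We apply (expression → function).

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }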
The next test needs more than one token.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

Before changing the behavior, we prepare the code: tokens are now collected in a result variable instead of being returned directly.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        } else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }
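Now the transformation itself: (if → while). The condition on the current character becomes a loop over the whole expression.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            } else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }

wcstod replaces _wtof because, in addition to the value, it reports a pointer to the first character after the number. That pointer tells the loop where to continue scanning.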
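The end pointer of wcstod can be tried in isolation. This is a minimal sketch using only the standard library; ParsePrefix is a hypothetical wrapper written for this illustration, and the sample inputs are ours. It assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Parses the number at the start of text and reports how many
// characters the number occupied.
inline double ParsePrefix(const wchar_t *text, std::size_t &consumed) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    consumed = static_cast<std::size_t>(end - text);
    return value;
}

inline void CheckParsePrefix() {
    std::size_t consumed = 0;

    // The number stops at the operator; scanning can resume at '+'.
    assert(ParsePrefix(L"12.5+3", consumed) == 12.5);
    assert(consumed == 4);

    // No number at all: the end pointer stays at the start.
    assert(ParsePrefix(L"abc", consumed) == 0.0);
    assert(consumed == 0);
}
```

The second case is why the lexer only calls wcstod after checking that the current character is a digit: if nothing is consumed, the loop would never advance.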
The next test adds spaces to the expression.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

We apply (unconditional → if) once more: an operator token is added only when the character really is an operator, and everything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        } else {
            ++current;
        }
    }
The function has grown, so it is time to refactor. The scanning logic moves into a class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {
        class Tokenizer {
        public:
            Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

            void Tokenize() {
                while(!EndOfExperssion()) {
                    if(IsNumber()) {
                        ScanNumber();
                    } else if(IsOperator()) {
                        ScanOperator();
                    } else {
                        MoveNext();
                    }
                }
            }

            const Tokens &Result() const { return m_result; }

        private:
            bool EndOfExperssion() const { return *m_current == L'\0'; }
            bool IsNumber() const { return iswdigit(*m_current) != 0; }
            void ScanNumber() {
                wchar_t *end = nullptr;
                m_result.push_back(wcstod(m_current, &end));
                m_current = end;
            }
            bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }
            void ScanOperator() {
                m_result.push_back(static_cast<Operator>(*m_current));
                MoveNext();
            }
            void MoveNext() { ++m_current; }

            const wchar_t *m_current;
            Tokens m_result;
        };
    } // namespace Detail
The last lexer test uses a complete expression with all operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen) }, tokens);
    }

Operator gets the missing values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
The IsOperator method of Tokenizer has to recognize them as well.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                     Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
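The any_of version repeats the list of operators that the enumeration already contains. One possible alternative, shown here only as a sketch and not as the article's code, keeps the operator characters in a single string. The free function IsOperatorChar and the hard-coded character list are assumptions of this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical alternative to IsOperator: membership test in a string
// of operator characters.
inline bool IsOperatorChar(wchar_t ch) {
    static const std::wstring operators = L"+-*/()";
    // std::wstring::find searches only the six stored characters, so the
    // terminating L'\0' is never reported as an operator.
    return operators.find(ch) != std::wstring::npos;
}

inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'*'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
    assert(!IsOperatorChar(L'\0'));
}
```

The trade-off is that the character list can drift from the Operator enumeration, whereas the any_of version at least names every enumerator explicitly.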
All tests pass. Here is the complete code.

Interpreter.h

    #pragma once
    #include <vector>
    #include <string>
    #include <stdexcept>
    #include <wchar.h>
    #include <wctype.h>
    #include <algorithm>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {
        class Tokenizer {
        public:
            Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

            void Tokenize() {
                while(!EndOfExperssion()) {
                    if(IsNumber()) {
                        ScanNumber();
                    } else if(IsOperator()) {
                        ScanOperator();
                    } else {
                        MoveNext();
                    }
                }
            }

            const Tokens &Result() const { return m_result; }

        private:
            bool EndOfExperssion() const { return *m_current == L'\0'; }
            bool IsNumber() const { return iswdigit(*m_current) != 0; }
            void ScanNumber() {
                wchar_t *end = nullptr;
                m_result.push_back(wcstod(m_current, &end));
                m_current = end;
            }
            bool IsOperator() const {
                auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                             Operator::LParen, Operator::RParen };
                return std::any_of(all.begin(), all.end(),
                    [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
            }
            void ScanOperator() {
                m_result.push_back(static_cast<Operator>(*m_current));
                MoveNext();
            }
            void MoveNext() { ++m_current; }

            const wchar_t *m_current;
            Tokens m_result;
        };
    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
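One property of the finished scanning loop is easy to overlook: its last branch skips every character that is neither a digit nor an operator, not only spaces. The sketch below is a condensed, portable restatement of the loop written for this illustration. TokenizeToText is a hypothetical name, and tokens are returned as text instead of Token objects to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Same three branches as the lexer's loop: number, operator, anything else.
inline std::vector<std::wstring> TokenizeToText(const std::wstring &expr) {
    static const std::wstring operators = L"+-*/()";
    std::vector<std::wstring> result;
    const wchar_t *current = expr.c_str();
    while(*current != L'\0') {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            (void)std::wcstod(current, &end);  // only the end position is needed here
            result.push_back(std::wstring(current, static_cast<std::size_t>(end - current)));
            current = end;
        } else if(operators.find(*current) != std::wstring::npos) {
            result.push_back(std::wstring(1, *current));
            ++current;
        } else {
            ++current;  // spaces and any unknown character are dropped silently
        }
    }
    return result;
}

inline void CheckTokenizeToText() {
    assert(TokenizeToText(L"1+2*3/(4-5)").size() == 11);
    assert(TokenizeToText(L" 1 + 12.34 ")[2] == L"12.34");
    // '$' is not reported as an error: only the two numbers remain.
    assert(TokenizeToText(L"1 $ 2").size() == 2);
}
```

Whether unknown characters should be ignored or rejected is a design decision that no test pins down yet; a further test could settle it.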
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
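Note that wcstod has replaced _wtof. Unlike _wtof, it reports through its second argument where the parsed number ends, and the loop needs that position to continue scanning.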
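The role of the end pointer can be seen in isolation. The sketch below is only an illustration: ScanNumbers is a hypothetical helper, not part of the article's lexer, and it assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper (not from the article): collects every number found
// in `text` and skips any character that does not start a number.
inline std::vector<double> ScanNumbers(const std::wstring &text)
{
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current)
    {
        if(std::iswdigit(*current))
        {
            wchar_t *end = nullptr;
            // wcstod stores in `end` the address of the first character
            // it did not consume.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        }
        else
        {
            ++current;
        }
    }
    return numbers;
}
```

Called with L"12.34+5", the helper yields two numbers: after the first call to wcstod the end pointer rests on the plus sign, which is skipped, and scanning resumes at the 5.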
The next test checks that whitespace is skipped.

TEST_METHOD(Should_skip_spaces)
{
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
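The (unconditional → if) transformation does the job: only a recognized operator character produces an operator token, and anything else is skipped.

while(*current)
{
    if(iswdigit(*current))
    {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus))
    {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else
    {
        ++current; // not a digit and not an operator: skip it
    }
}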
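Time to refactor. The scanning logic moves into a class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr)
{
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}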
Detail::Tokenizer

namespace Detail
{
    class Tokenizer
    {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize()
        {
            while(!EndOfExperssion())
            {
                if(IsNumber())
                {
                    ScanNumber();
                }
                else if(IsOperator())
                {
                    ScanOperator();
                }
                else
                {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber()
        {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const
        {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator()
        {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };
} // namespace Detail
The last lexer test covers an expression that uses every operator and parentheses.

TEST_METHOD(Should_tokenize_complex_experssion)
{
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
        Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
        Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enumeration gains the missing operators and the two parentheses.

enum class Operator : wchar_t
{
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer has to recognize all of them.

bool IsOperator() const
{
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
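As a side note on the design, the same membership check can be written against a string of operator characters. This is only an alternative sketch, not the article's code: IsOperatorChar is a hypothetical free function, and it trades the enum-based list for a string that has to be kept in sync with Operator by hand.

```cpp
#include <string>

// Hypothetical alternative (not from the article): true when `c` is one of
// the characters the lexer treats as an operator or a parenthesis.
inline bool IsOperatorChar(wchar_t c)
{
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}
```

The enum-based version in the article keeps the set of operators in one place, which is arguably the safer choice once new operators are added.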
The lexer is complete for now. The full listings follow.

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter
{
    enum class Operator : wchar_t
    {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op)
    {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type)
    {
        switch(type)
        {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
        }
    }

    class Token
    {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const
        {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const
        {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        // Tokens are equal when both the type and the stored value match.
        friend inline bool operator==(const Token &left, const Token &right)
        {
            if(left.m_type == right.m_type)
            {
                switch(left.m_type)
                {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union
        {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token)
    {
        switch(token.Type())
        {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer
    {
        namespace Detail
        {
            class Tokenizer
            {
            public:
                Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

                void Tokenize()
                {
                    while(!EndOfExperssion())
                    {
                        if(IsNumber())
                        {
                            ScanNumber();
                        }
                        else if(IsOperator())
                        {
                            ScanOperator();
                        }
                        else
                        {
                            MoveNext();
                        }
                    }
                }

                const Tokens &Result() const { return m_result; }

            private:
                bool EndOfExperssion() const { return *m_current == L'\0'; }

                bool IsNumber() const { return iswdigit(*m_current) != 0; }

                void ScanNumber()
                {
                    wchar_t *end = nullptr;
                    m_result.push_back(wcstod(m_current, &end));
                    m_current = end;
                }

                bool IsOperator() const
                {
                    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                                 Operator::Div, Operator::LParen, Operator::RParen };
                    return std::any_of(all.begin(), all.end(),
                        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
                }

                void ScanOperator()
                {
                    m_result.push_back(static_cast<Operator>(*m_current));
                    MoveNext();
                }

                void MoveNext() { ++m_current; }

                const wchar_t *m_current;
                Tokens m_result;
            };
        } // namespace Detail

        inline Tokens Tokenize(const std::wstring &expr)
        {
            Detail::Tokenizer tokenizer(expr);
            tokenizer.Tokenize();
            return tokenizer.Result();
        }
    } // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests
{
    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange
    {
        template<class T, class ActualRange>
        static void AreEqual(initializer_list<T> expect, const ActualRange &actual)
        {
            auto actualIter = begin(actual);
            auto expectIter = begin(expect);

            Assert::AreEqual(distance(expectIter, end(expect)),
                             distance(actualIter, end(actual)), L"Size differs.");

            for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter)
            {
                auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
                Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
            }
        }
    } // namespace AssertRange

    TEST_CLASS(LexerTests)
    {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression)
        {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator)
        {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit)
        {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number)
        {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number)
        {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces)
        {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion)
        {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
                Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
                Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests)
    {
    public:
        TEST_METHOD(Should_get_type_for_operator_token)
        {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token)
        {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token)
        {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token)
        {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };
}
GitHub . , . "__".
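The lexer returns its result as a std::vector of tokens, named Tokens. To get the first test to compile and fail, a stub is enough.

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter
{
    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer
    {
        // Stub: exists only so that the first test compiles and fails.
        inline Tokens Tokenize(std::string expr)
        {
            throw std::exception();
        }
    } // namespace Lexer
} // namespace Interpreter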
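To decide how to make a failing test pass, we rely on the Transformation Priority Premise (TPP) proposed by Robert Martin. Refactorings change the structure of code without changing its behavior. Transformations are their counterpart: small changes that alter behavior and make the code slightly more general each time. They are ordered from simplest to most complex, and the premise is that always choosing the simplest transformation that passes the current test leads to a better algorithm.

The transformations, in order of priority:
- ({} → nil): from no code at all to code that employs nil.
- (nil → constant): replace nil with a constant.
- (constant → constant+): replace a simple constant with a more complex one.
- (constant → scalar): replace a constant with a variable or an argument.
- (statement → statements): add more unconditional statements (break, continue, return, and the like).
- (unconditional → if): split the execution path.
- (scalar → array): replace a variable or argument with an array.
- (array → container): replace an array with a container.
- (statement → recursion): replace a statement with recursion.
- (if → while): replace a condition with a loop.
- (expression → function): replace an expression with a function or algorithm.
- (variable → assignment): replace the value of a variable.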
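To make the idea concrete, here is a small sketch of how a function might grow under successive transformations. It is a hypothetical example that is not taken from the article: the three CountDigits versions and the tests that supposedly drive them are assumptions chosen only for illustration, and the last step bundles an extra condition inside the loop.

```cpp
#include <string>

// Hypothetical example (not from the article): three successive versions
// of a digit counter, each just general enough for the tests so far.

// (nil -> constant): the test "empty string gives 0" passes with a constant.
inline int CountDigitsV1(const std::string &)
{
    return 0;
}

// (unconditional -> if): the test "7 gives 1" forces a second path.
inline int CountDigitsV2(const std::string &s)
{
    if(s.empty()) return 0;
    return 1;
}

// (if -> while): the test "42 gives 2" forces iteration over the input.
inline int CountDigitsV3(const std::string &s)
{
    int count = 0;
    std::string::size_type i = 0;
    while(i < s.size())
    {
        if(s[i] >= '0' && s[i] <= '9') ++count;
        ++i;
    }
    return count;
}
```

Each version keeps every earlier test passing, which is the property the lexer's own development relies on.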
The first test passes with the simplest possible change: return an empty list.

inline Tokens Tokenize(std::string expr)
{
    return{};
}
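Before moving on, std::string is replaced with std::wstring, since we want to work with Unicode. The next test expects a single operator token.

TEST_METHOD(Should_tokenize_single_plus_operator)
{
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}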
AssertRange is a small helper of our own. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of any mismatch.

AssertRange

namespace AssertRange
{
    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual)
    {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter)
        {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }
} // namespace AssertRange
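For readers without the Visual Studio test framework, the same comparison can be sketched with the standard library alone. RangesEqual below is a hypothetical stand-in, not part of the article: it returns a bool instead of raising an assertion failure, and it does not report the position of the mismatch.

```cpp
#include <algorithm>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical framework-free analogue of AssertRange::AreEqual: true when
// both ranges have the same length and pairwise equal elements.
template<class T, class ActualRange>
bool RangesEqual(std::initializer_list<T> expect, const ActualRange &actual)
{
    auto actualBegin = std::begin(actual);
    auto actualEnd = std::end(actual);

    // Compare sizes first so that std::equal never reads past `actual`.
    if(std::distance(expect.begin(), expect.end()) != std::distance(actualBegin, actualEnd))
        return false;

    return std::equal(expect.begin(), expect.end(), actualBegin);
}
```

As in the article's helper, taking the expected values as an initializer list lets a test write them inline in braces.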
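Operator is an enumeration whose underlying type is wchar_t, so each enumerator carries the character that denotes it. For now, a token is simply an operator.

enum class Operator : wchar_t
{
    Plus = L'+',
};

typedef Operator Token;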
To report a failure, Assert has to print the values it compared, so a ToString function is needed for the token type.

inline std::wstring ToString(const Token &token)
{
    return{ static_cast<wchar_t>(token) };
}
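The test now compiles and fails. To make it pass while the empty-expression test stays green, the execution path is split: (unconditional → if).

inline Tokens Tokenize(std::wstring expr)
{
    if(expr.empty())
    {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}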
The next test expects a number token.

TEST_METHOD(Should_tokenize_single_digit)
{
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
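This test cannot pass as things stand: a token must now be able to hold either an operator or a number. The designs considered here include a hierarchy of Token classes queried through dynamic_cast, an approach built on std::function, and a type-erased holder such as Boost.Any. We settle on a single Token class that records its own type, and develop it test-first in a separate test class.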
enum class TokenType { Operator, Number };

class Token
{
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};

…

TEST_CLASS(TokenTests)
{
public:
    TEST_METHOD(Should_get_type_for_operator_token)
    {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
A ToString function for TokenType is needed too, for the same reason as before. The next test covers number tokens.

TEST_METHOD(Should_get_type_for_number_token)
{
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The (constant → scalar) transformation replaces the hard-coded type with a field.

class Token
{
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }

private:
    TokenType m_type;
};
Next, the token should give back the operator it was built from.

TEST_METHOD(Should_get_operator_code_from_operator_token)
{
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The operator is stored in a field and exposed through a conversion operator.

class Token
{
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same goes for the number.

TEST_METHOD(Should_get_number_value_from_number_token)
{
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
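A token is either an operator or a number, never both, so the two values can share storage in an anonymous union.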
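The pattern itself, a type tag next to an anonymous union, can be sketched on a smaller class. NumberOrChar is a hypothetical illustration rather than the article's Token: it stores either a double or a wchar_t and refuses to hand out the value it does not hold.

```cpp
#include <stdexcept>

// Hypothetical minimal tagged union (not from the article).
class NumberOrChar
{
public:
    NumberOrChar(double n) : m_isNumber(true), m_number(n) {}
    NumberOrChar(wchar_t c) : m_isNumber(false), m_char(c) {}

    bool IsNumber() const { return m_isNumber; }

    double Number() const
    {
        // The tag is checked before the union member is read.
        if(!m_isNumber) throw std::logic_error("Not a number.");
        return m_number;
    }

    wchar_t Char() const
    {
        if(m_isNumber) throw std::logic_error("Not a character.");
        return m_char;
    }

private:
    bool m_isNumber; // the tag: tells which union member is active
    union
    {
        double m_number;
        wchar_t m_char;
    };
};
```

Only the member named in the constructor's initializer list is ever written, and the tag guarantees that only that member is ever read.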
GitHub . , . "__".
. , .
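For now Token is an empty struct, Tokens is a typedef for std::vector of Token, and Tokenize simply throws, so the first test compiles and fails.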
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: compiles, but makes the first test fail (red).
inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter
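Now the test has to pass with the smallest possible change to the code. A useful guide here is The Transformation Priority Premise (TPP): the code evolves through small transformations, each of which makes it slightly more general, and the transformations are ordered from simple to complex. When several of them would make the test pass, the simpler one is preferred.
The transformations, in priority order:
- ({} → nil): no code at all becomes code that employs nil.
- (nil → constant): nil is replaced with a constant.
- (constant → constant+): a simple constant becomes a more complex one.
- (constant → scalar): a constant is replaced with a variable or an argument.
- (statement → statements): more unconditional statements are added (break, continue, return and the like).
- (unconditional → if): the execution path is split.
- (scalar → array): a variable becomes an array.
- (array → container): an array becomes a container.
- (statement → recursion): a statement becomes a recursive call.
- (if → while): a condition becomes a loop.
- (expression → function): an expression is replaced with a function or algorithm.
- (variable → assignment): the value of a variable is replaced.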
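As a rough illustration of how such transformations drive an implementation, here is a hypothetical digit-counting function (not part of the interpreter) grown in three steps. The function names and the V1 to V3 suffixes are made up for this sketch; each version is what the tests written so far would require.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Step 1 (nil -> constant): the only test so far passes an empty string.
inline std::size_t CountDigitsV1(const std::wstring &) { return 0; }

// Step 2 (unconditional -> if): a single-character test splits the path.
inline std::size_t CountDigitsV2(const std::wstring &text) {
    if (text.empty()) return 0;
    return std::iswdigit(text[0]) ? 1 : 0;
}

// Step 3 (if -> while): a multi-character test turns the branch into a loop.
inline std::size_t CountDigitsV3(const std::wstring &text) {
    std::size_t count = 0;
    std::size_t pos = 0;
    while (pos < text.size()) {
        if (std::iswdigit(text[pos])) ++count;
        ++pos;
    }
    return count;
}
```

Each version is the least general code that satisfies its tests, which is the habit the priority list is meant to encourage.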
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
Along the way, std::string is replaced with std::wstring so that the lexer works with Unicode strings.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
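AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of a mismatch.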
AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
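Outside the Visual Studio framework, the same check could be sketched with std::mismatch. FirstMismatch is a hypothetical helper name, and returning an index instead of asserting is an assumption made to keep the sketch framework-independent.

```cpp
#include <algorithm>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Returns the index of the first differing element, or -1 when the ranges
// are equal. If one range is a prefix of the other, the length of the
// shorter range is returned.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    const std::ptrdiff_t expectSize = std::distance(std::begin(expect), std::end(expect));
    const std::ptrdiff_t actualSize = std::distance(std::begin(actual), std::end(actual));
    const std::ptrdiff_t common = std::min(expectSize, actualSize);

    // Compare only the part that both ranges have.
    auto expectEnd = std::next(std::begin(expect), common);
    auto found = std::mismatch(std::begin(expect), expectEnd, std::begin(actual));
    if (found.first != expectEnd) {
        return std::distance(std::begin(expect), found.first);
    }
    return expectSize == actualSize ? -1 : common;
}
```

With a std::vector<double> holding 1.0, 2.5 and 3.0, comparing against { 1.0, 2.0, 3.0 } yields 1, and comparing against the identical list yields -1.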
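Operator is an enumeration whose underlying type is wchar_t, so the value of each enumerator is the character of the operator it stands for.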
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;
For Assert to be able to print values of this type when a comparison fails, a ToString function is needed.
std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
A ToString function is needed for TokenType as well, so that Assert can print it.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The test fails. We apply (constant → scalar): the constant returned by Type is replaced with a member variable.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two values can share storage in a union.
Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        // Wide literal: a narrow string does not convert to std::wstring.
        default: return L"Unknown token.";
    }
}
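The article's compiler predates C++17, but on a newer toolchain the hand-written tagged union could be replaced by std::variant, which tracks the active member itself. The sketch below is such a swapped-in alternative, not the article's code; VariantToken, IsOperator and IsNumber are hypothetical names, and the Operator enumeration is cut down to two values for brevity.

```cpp
#include <variant>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// Holds either an operator or a number; std::variant remembers which one.
using VariantToken = std::variant<Operator, double>;

inline bool IsOperator(const VariantToken &token) {
    return std::holds_alternative<Operator>(token);
}

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}
```

Reading the wrong alternative with std::get throws std::bad_variant_access, which plays the role of the std::logic_error checks in the conversion operators above.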
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
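Fortunately, the C standard library already has what we need: isdigit to check whether a character is a digit and atof to convert a string to a number, both with counterparts for wchar_t. Using them is the (expression → function) transformation.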
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    // _wtof is specific to the Microsoft C runtime.
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
. .
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
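The function now has to return more than one token. Before changing its behaviour, we refactor it: a result variable is introduced to accumulate the tokens, and every branch adds to it.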
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the transformation itself: (if → while). The check for the end of the string becomes the loop condition.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
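wcstod is used here instead of _wtof because it reports, through its second argument, where the parsed number ends. The scan then continues from that position.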
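A small sketch of this behaviour: wcstod writes the position just past the parsed number into its second argument. ScanNumber here is a hypothetical free function written for illustration, not the member function from the article.

```cpp
#include <cwchar>

// Parses the number that starts at `current` and advances `current` past it.
// Assumes `current` points at a digit, as the lexer checks with iswdigit first.
inline double ScanNumber(const wchar_t *&current) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(current, &end);
    current = end;
    return value;
}
```

Called on L"12.34+5", the function returns 12.34 and leaves the pointer on the '+' character, ready for the next token. Note that wcstod honours the current C locale's decimal separator, so the sketch assumes the default "C" locale.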
. .
.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
One more (unconditional → if): an operator is now recognized explicitly, and any other character is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
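The tests pass, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.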
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enumeration gets the remaining operators and the parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of the Tokenizer class now has to recognize all of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
}
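Because the value of each enumerator is the operator character itself, the same check can also be sketched as a search in a string of operator characters. This is an alternative formulation, not the article's code; IsOperatorChar and kOperatorChars are hypothetical names.

```cpp
#include <cwchar>

// Every character that the lexer treats as an operator or a parenthesis.
static const wchar_t kOperatorChars[] = L"+-*/()";

inline bool IsOperatorChar(wchar_t ch) {
    // wcschr would also match the terminating L'\0', so exclude it explicitly.
    return ch != L'\0' && std::wcschr(kOperatorChars, ch) != nullptr;
}
```

The trade-off is that the string has to be kept in sync with the Operator enumeration by hand, whereas the std::any_of version names the enumerators directly.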
. .
Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }
private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
    }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }
    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }
    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
The test passes. The next one combines an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
The lexer now has to return more than one token. Before changing its behaviour, refactor: collect the tokens in a local variable result and return it at the end.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the transformation itself: (if → while). The lexer walks the whole string instead of looking only at its first character.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the number
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
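Note the switch to wcstod: unlike _wtof, it reports through its second argument where the parsed number ends, so the loop can continue from the first character after the number.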
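The loop above depends on wcstod reporting where it stopped parsing. Below is a minimal sketch of that behaviour in isolation, using only the standard library; ScanNumbers is a hypothetical helper written for this illustration and is not part of the article's lexer.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in text and
// skips all other characters.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while (*current) {
        if (std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod parses as many characters as form a number and
            // stores a pointer to the first unparsed character in end.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        } else {
            ++current;
        }
    }
    return numbers;
}
```

Because wcstod is only called when the current character is a digit, it always consumes at least one character and the loop cannot stall. One caveat to keep in mind: wcstod also accepts exponent notation, so input such as 1e5 is read as a single number.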
The tests pass. The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
Apply (unconditional → if): only a plus is treated as an operator, and any other character is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
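The tests pass, but the function has grown. Refactor by moving the logic into a class in the Detail namespace. The Tokenize function then becomes a thin wrapper.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}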
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext(); // skip spaces and any other unrecognized character
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The code now reads almost like a description of the algorithm. What remains is to recognize the other operators and the parentheses, which one more test covers.
TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
Extend the Operator enumeration with the new values.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
Then update the IsOperator method of the Tokenizer class so that it recognizes all of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                 Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
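For comparison, the same membership check can be written as a lookup in a fixed string of operator characters. This is only an alternative sketch and not the article's approach; IsOperatorChar is a hypothetical free function, and the character set is an assumption that has to be kept in sync with the Operator enumeration by hand.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to Tokenizer::IsOperator: looks the
// character up in a fixed set of operator characters.
inline bool IsOperatorChar(wchar_t ch) {
    // wcschr also matches the terminating L'\0' of the set,
    // so the end-of-string character is excluded explicitly.
    return ch != L'\0' && std::wcschr(L"+-*/()", ch) != nullptr;
}
```

The lookup is shorter, but it duplicates information already present in the enumeration. The any_of version in the article lists the enumerators themselves, so a misspelled operator there is a compile error rather than a silent mismatch.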
All tests pass. The complete code at this point is listed below.
Interpreter.h
#pragma once
#include <vector>
#include <string>
#include <stdexcept>
#include <algorithm>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext(); // skip spaces and any other unrecognized character
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                     Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

// Compares an expected list with an actual range, element by element.
template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
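The tests above depend on the Visual Studio test framework. As a rough sketch of how the same lexer checks could be run with only the standard library, the block below uses a simplified stand-in for the Token class and a boolean counterpart of AssertRange::AreEqual. The names Num, Op and RangeEquals are hypothetical and exist only in this illustration, and the lexer body is condensed into one function instead of the Detail::Tokenizer class.

```cpp
#include <algorithm>
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <initializer_list>
#include <string>
#include <vector>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')'
};

// Simplified stand-in for the article's Token class (no union, no checks).
struct Token {
    bool isNumber;
    double number;
    Operator op;
};

inline bool operator==(const Token &l, const Token &r) {
    if (l.isNumber != r.isNumber) return false;
    return l.isNumber ? l.number == r.number : l.op == r.op;
}

// Hypothetical factory helpers for the two kinds of token.
inline Token Num(double value) { return Token{ true, value, Operator::Plus }; }
inline Token Op(Operator op) { return Token{ false, 0.0, op }; }

// Condensed version of the lexer loop developed in the article.
inline std::vector<Token> Tokenize(const std::wstring &expr) {
    static const std::wstring operators = L"+-*/()";
    std::vector<Token> result;
    const wchar_t *current = expr.c_str();
    while (*current) {
        if (std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(Num(std::wcstod(current, &end)));
            current = end;
        } else if (operators.find(*current) != std::wstring::npos) {
            result.push_back(Op(static_cast<Operator>(*current)));
            ++current;
        } else {
            ++current; // anything else, including spaces, is skipped
        }
    }
    return result;
}

// Hypothetical counterpart of AssertRange::AreEqual that returns a bool
// instead of reporting through a test framework.
inline bool RangeEquals(std::initializer_list<Token> expect,
                        const std::vector<Token> &actual) {
    return expect.size() == actual.size()
        && std::equal(expect.begin(), expect.end(), actual.begin());
}
```

A check such as RangeEquals({ Num(1), Op(Operator::Plus), Num(2) }, Tokenize(L"1 + 2")) can then be wrapped in a plain assert. Unlike the framework version, this helper does not report which position differs, which is the main thing the article's AssertRange adds.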
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

// Compares an expected list with an actual range: sizes first, then element by element.
template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }
    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }
    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code is available on GitHub.
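The token list is a std::vector of tokens, aliased as Tokens. A stub that compiles but throws is enough to see the first test fail:

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter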
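To make a failing test pass with as little code as possible, it helps to follow The Transformation Priority Premise (TPP), proposed by Robert C. Martin. Transformations are the counterpart of refactorings: a refactoring changes the structure of the code without changing its behavior, whereas a transformation changes the behavior while keeping the structure as simple as possible. Transformations are ordered from simple to complex, and TPP suggests preferring the simplest one that makes the current test pass.

The transformations, in order of priority:
- ({} → nil) From no code at all to code that returns an empty value.
- (nil → constant) From an empty value to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) From a constant to a variable or an argument.
- (statement → statements) Adding more unconditional statements (break, continue, return and the like).
- (unconditional → if) Splitting the execution path.
- (scalar → array) From a single variable to an array.
- (array → container) From an array to a more complex container.
- (statement → recursion) From a statement to recursion.
- (if → while) From a condition to a loop.
- (expression → function) Replacing an expression with a function or algorithm.
- (variable → assignment) Replacing the value of a variable.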
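As a rough illustration of how the transformations escalate, the sketch below grows a digit-counting function through three of them. The function and its CountDigitsV1..V3 names are hypothetical and are not part of the interpreter; each version is meant to be the smallest change that lets one more test pass.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical example, unrelated to the interpreter's own code.

// (nil -> constant): the only test so far feeds an empty string.
inline int CountDigitsV1(const std::wstring &) {
    return 0;
}

// (unconditional -> if): a one-character string may or may not be a digit.
inline int CountDigitsV2(const std::wstring &s) {
    if(!s.empty() && std::iswdigit(s[0])) return 1;
    return 0;
}

// (if -> while): strings of any length; this step also needs a counter variable.
inline int CountDigitsV3(const std::wstring &s) {
    int count = 0;
    std::size_t i = 0;
    while(i < s.size()) {
        if(std::iswdigit(s[i])) ++count;
        ++i;
    }
    return count;
}
```

Each version keeps the earlier tests green, which is the property the lexer steps that follow rely on as well.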
Returning an empty list is enough to make the first test pass:

inline Tokens Tokenize(std::string expr) {
    return{};
}
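Before moving on, replace std::string with std::wstring so that expressions can contain Unicode characters. The next test expects a single plus sign to be tokenized as an operator:

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}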
AssertRange is not part of the test framework. It is a small helper whose AreEqual compares an expected list with an actual range: first the sizes, then the elements one by one, reporting the position of a mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
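The same comparison can be sketched without the Visual Studio framework, which may help when porting the tests to another runner. FirstMismatch is a hypothetical name, and unlike AssertRange::AreEqual it returns a position instead of asserting, so treat it as an illustration of the idea rather than a drop-in replacement.

```cpp
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first mismatch between
// `expect` and `actual`, or -1 when both have the same length and equal
// elements. If one range is a prefix of the other, the result is the
// length of the shorter one.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        // Convert the actual element to the expected type before comparing.
        if(!(*expectIter == static_cast<T>(*actualIter))) return position;
    }
    // One of the ranges ended early: the sizes differ.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) return position;
    return -1;
}
```

Converting each actual element to T mirrors what the explicit Assert::AreEqual<T> call does in the final version of the tests.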
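Operator is an enumeration whose underlying type is wchar_t, so the value of each enumerator is the operator's character itself. For now, Token is simply an alias for Operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;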
To print the compared values when a check fails, Assert needs a ToString function for the Token type:

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
To keep the previous test green as well, the execution path has to be split, which is the (unconditional → if) transformation:

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test covers a number that consists of a single digit:

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
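This test calls for a design decision, because a token must now hold either an operator or a number. Options include a Token class hierarchy inspected with dynamic_cast, std::function, or a type-erasing holder such as Boost.Any. The simplest one is used here: a single class that stores the token type next to its value.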
The new class gets its own test class, TokenTests. The first test checks that an operator token reports its type:

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
A ToString function is needed for TokenType too, for the same reason as before. The next test is for a number token:

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The (constant → scalar) transformation replaces the hard-coded type with a member variable:

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
The next test reads the operator back from an operator token:

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}

Storing the operator and adding a conversion operator is enough:

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};

The same for the value of a number token:

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two values can share storage in a union. Each conversion checks the stored type and throws if the token holds the other kind of value.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
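To make the behavior of the checked conversions concrete, here is a compact, self-contained copy of the class with a small probe. ThrowsWhenReadAsNumber is a hypothetical helper added only for this illustration; the Operator enum is trimmed to a single value to keep the sketch short.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Same tagged-union layout as in the article, reduced to the essentials.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical probe: reports whether reading the token as a number throws.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

Reading an operator token as a number throws instead of returning whatever bytes happen to be in the union, so a mistake in the parser or calculator surfaces at the point of misuse.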
With Token able to hold a number, return to the lexer. The single-digit test passes with a direct check:

if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };
The next test is a floating-point number:

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
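Instead of parsing digits by hand, use the C standard library: isdigit and atof, or rather their wchar_t counterparts. This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) }; // _wtof is specific to the Microsoft C runtime
    return{ static_cast<Operator>(*current) };
}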
The next test combines an operator and a number:

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
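Returning from each branch no longer works, because tokens now have to be accumulated. As a preparatory refactoring, introduce a result variable; the new test still fails after this step.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}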
Now the key step, the (if → while) transformation: the single check becomes a loop over the whole expression.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
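wcstod replaces _wtof here because, besides the parsed value, it returns a pointer to the first character after the number. The loop needs that pointer to continue from the right place.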
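The role of the end pointer is easiest to see in isolation. The sketch below wraps std::wcstod in a hypothetical ScanNumber helper (the name and the ScannedNumber struct are assumptions made for this example) that returns both the value and the number of characters consumed.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical result type: the parsed value and how many characters it used.
struct ScannedNumber {
    double value;
    std::size_t length;
};

// Parses the number that starts at `text`. std::wcstod stores in `end`
// a pointer to the first character it did not consume.
inline ScannedNumber ScanNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    return { value, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.5+3" the helper reports a length of 4, so a caller can resume scanning at the plus sign. When nothing can be parsed, wcstod sets end to the start of the input and the length is 0.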
The next test checks that spaces are skipped:

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}

Another (unconditional → if): a character is added as an operator only if it really is one, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}
The function has grown enough to warrant refactoring. The logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The last lexer test covers an expression with every operator and with parentheses:

TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
Extend the Operator enumeration with the remaining operators and the parentheses:

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
Then update IsOperator in Tokenizer so that it recognizes all of them:

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
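The membership check can also be tried outside the class. In the sketch below, IsOperatorChar is a hypothetical free-function variant that takes the character as an argument instead of reading m_current; otherwise it follows the same std::any_of approach.

```cpp
#include <algorithm>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')'
};

// Hypothetical free-function variant of Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t c) {
    // `auto` with a braced list deduces std::initializer_list<Operator>.
    const auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                       Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [c](Operator o) { return c == static_cast<wchar_t>(o); });
}
```

One trade-off of this design is that the list of enumerators is repeated by hand, so a new operator has to be added both to the enum and to the check.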
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
Next, the operator stored in a token must be retrievable.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The operator is kept in a member variable and returned by a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same is required for the value of a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in a union. Each conversion operator throws if the token is of the other type.

Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
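The effect of the type check in the conversion operators is easiest to see in isolation. The sketch below repeats a reduced version of the class so that it compiles on its own; ThrowsWhenReadAsOperator and TokenGuardExample are hypothetical names added for illustration and are not part of the article's code.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Reduced copy of the tagged-union token, kept only to make the sketch compile.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: reports whether reading the token as an operator throws.
inline bool ThrowsWhenReadAsOperator(const Token &token) {
    try {
        Operator op = token; // implicit conversion calls operator Operator()
        (void)op;
    } catch(const std::logic_error &) {
        return true;
    }
    return false;
}

// Usage example: only the member that matches the tag may be read.
inline void TokenGuardExample() {
    assert(ThrowsWhenReadAsOperator(Token(1.5)));
    assert(!ThrowsWhenReadAsOperator(Token(Operator::Plus)));
}
```

The tag is what makes the union safe to use: reading the inactive member would be undefined behavior, so the check turns a silent bug into an exception.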
With the Token class in place, the lexer can return to the single-digit test. The simplest change that makes it pass:

if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
This handles a single digit only, so the next test requires a complete floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
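The C standard library already has what is needed: isdigit recognizes a digit and atof converts a string to a number, and both have counterparts for wchar_t (iswdigit, and _wtof in the Microsoft runtime). The (expression → function) transformation replaces the hand-written expression with these calls.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}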
The next test combines an operator and a number in one expression.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
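Before several tokens can be handled, the function is refactored: tokens are collected in a result variable instead of being returned immediately. The behavior does not change yet.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}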
The transformation that makes the test pass is (if → while): the input is scanned in a loop until the end of the string.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the number
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
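wcstod performs the same conversion as _wtof, but it also returns, through its second argument, a pointer to the first character after the number. Scanning continues from that pointer, so the length of the number does not have to be calculated separately.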
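How the end pointer of wcstod drives a scanning loop can be shown with a standalone sketch. ScanNumbers and ScanNumbersExample are hypothetical names used only for illustration, and the function simply skips any character that does not start a number.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper (not from the article): collects every number that
// std::wcstod can read from expr, skipping characters it cannot convert.
inline std::vector<double> ScanNumbers(const std::wstring &expr) {
    std::vector<double> numbers;
    const wchar_t *current = expr.c_str();
    while(*current) {
        wchar_t *end = nullptr;
        const double value = std::wcstod(current, &end);
        if(end == current) {
            ++current;      // nothing was converted here: skip one character
        } else {
            numbers.push_back(value);
            current = end;  // resume right after the number
        }
    }
    return numbers;
}

// Usage example: three numbers separated by spaces.
inline void ScanNumbersExample() {
    const std::vector<double> expected{ 1.0, 22.5, 3.0 };
    assert(ScanNumbers(L"1 22.5 3") == expected);
}
```

Note that wcstod itself accepts leading whitespace and a leading sign, so called on its own it would swallow the operator in an input such as "+2". Checking iswdigit before calling it, as the lexer does, keeps a leading plus or minus as a separate operator token.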
The next test requires spaces to be skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
One more (unconditional → if) transformation: a character becomes an operator token only if it is a known operator, and any other character is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
The function has grown enough to be refactored. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer
namespace Detail {
    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };
} // namespace Detail
The remaining operators and the parentheses are covered by a single test with a more complex expression.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enumeration gets the missing values.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer now has to recognize all of them.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
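The membership check can also be written as a free function, which makes it easy to try on its own. This is a sketch only: IsOperatorChar and IsOperatorCharExample are hypothetical names, and a static array stands in for the local initializer list.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function variant of the check: true if ch is the
// character of one of the known operators.
inline bool IsOperatorChar(wchar_t ch) {
    static const Operator all[] = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(std::begin(all), std::end(all),
        [ch](Operator op) { return ch == static_cast<wchar_t>(op); });
}

// Usage example: operators are recognized, digits and spaces are not.
inline void IsOperatorCharExample() {
    assert(IsOperatorChar(L'*'));
    assert(!IsOperatorChar(L'1'));
}
```

Because each enumerator's value is the operator's character, adding an operator only requires a new enumerator and a new entry in the list; the comparison itself stays the same.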
The complete code at this point is listed below.

Interpreter.h
#pragma once
#include <vector>
#include <string>
#include <stdexcept>
#include <wchar.h>
#include <wctype.h>
#include <algorithm>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer

} // namespace Interpreter

InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {
    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }
} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code for this article is available on GitHub.
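The first test does not compile yet. The minimum needed to compile it is an empty Token, a Tokens type defined as a std::vector of tokens, and a Tokenize function that throws, so that the test fails.

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};

typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer

} // namespace Interpreter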
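The test now compiles and fails, so the next step is to make it pass with the smallest possible change. A useful guide here is The Transformation Priority Premise (TPP), proposed by Robert C. Martin. Refactorings change the structure of code without changing its behavior; transformations are their counterpart and change behavior in small steps, each of which makes the code slightly more general. When several transformations would make a failing test pass, TPP suggests preferring the simpler one, that is, the one that comes earlier in the list.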
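The transformations, from the simplest to the most complex:
- ({} → nil): from no code at all to code that uses nil.
- (nil → constant): nil is replaced with a constant.
- (constant → constant+): a simple constant is replaced with a more complex one.
- (constant → scalar): a constant is replaced with a variable or an argument.
- (statement → statements): more unconditional statements are added (break, continue, return and so on).
- (unconditional → if): the execution path is split.
- (scalar → array): a variable is replaced with an array.
- (array → container): an array is replaced with a container.
- (statement → recursion): a statement is replaced with recursion.
- (if → while): a condition is replaced with a loop.
- (expression → function): an expression is replaced with a function or an algorithm.
- (variable → assignment): the value of a variable is replaced.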
For the first test the simplest transformations are enough: the function returns an empty list.

inline Tokens Tokenize(std::string expr) {
    return{};
}
Before the next test, std::string is replaced with std::wstring so that expressions containing Unicode characters can be handled.
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
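To get the test to compile, Tokenize is declared to take the expression string and return a std::vector of tokens, aliased as Tokens for brevity. For now it is a stub that throws, so the test builds and fails:
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: lets the first test compile and fail (red).
inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter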
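Now the test has to pass with as little code as possible. To choose what to write, I follow The Transformation Priority Premise (TPP) by Robert Martin. Transformations are the counterpart of refactorings: a refactoring changes the structure of the code without changing its behaviour, whereas a transformation changes behaviour, moving the code from the specific towards the general. The transformations are ordered from simple to complex, and the premise is that preferring the simpler ones, and choosing the next test so that a simple one is enough, leads to simpler code.
The transformations, in priority order:
- ({} → nil) From no code at all to code that returns nothing (nil).
- (nil → constant) From nil to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements (break, continue, return and the like).
- (unconditional → if) Split the execution path with a condition.
- (scalar → array) From a variable or argument to an array.
- (array → container) From an array to a more complex container.
- (statement → recursion) Replace a statement with recursion.
- (if → while) Replace a condition with a loop.
- (expression → function) Replace an expression with a function or an algorithm.
- (variable → assignment) Change the value of a variable.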
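To make the list more concrete, here is a minimal sketch of three transformations applied in turn. It uses a hypothetical CountDigits function that is not part of the interpreter; the V1 to V3 suffixes only mark successive states of the same function as the tests become more specific.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// (nil -> constant): enough while the only test passes an empty string.
inline int CountDigitsV1(const std::wstring &) { return 0; }

// (unconditional -> if): enough once a test passes a single character.
inline int CountDigitsV2(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}

// (if -> while): needed once a test passes several characters.
inline int CountDigitsV3(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(std::iswdigit(text[i])) ++count;
        ++i;
    }
    return count;
}
```

Each version is the least general code that satisfies the tests written so far, which is the working style the rest of the walkthrough follows.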
Here the simplest transformation is enough: return an empty list.
inline Tokens Tokenize(std::string expr) { return{}; }
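The test passes, and there is nothing to refactor yet, so the test list is updated and the next item is taken.
For this test the parameter type changes from std::string to std::wstring, so that expressions are handled as Unicode strings.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}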
AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of a mismatch.
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    // Compare the lengths first so that a size mismatch is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
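The same element-by-element comparison can be sketched without the Visual Studio test framework, which may help when trying the idea in another environment. FirstMismatch is a hypothetical name, and returning an index instead of asserting is an assumption made for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper, not part of the article's code: returns the index of
// the first position where the two ranges differ, or -1 if they are equal.
// If one range is a prefix of the other, the length of the shorter one is returned.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
          ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    const bool sameLength = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameLength ? -1 : index;
}
```

Like AssertRange::AreEqual, it only needs operator== on the element type, so it works for any container that supports std::begin and std::end.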
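The test also needs the Operator enumeration. Its underlying type is wchar_t, so the value of an operator is simply its character. For now Token is just an alias for Operator.
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;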
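For the test to compile, Assert also needs a ToString function for the compared type, so that it can print the values when a comparison fails.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}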
Now the test compiles and fails. To make it pass, apply the (unconditional → if) transformation.
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The test passes. The next item in the test list is a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
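This test cannot be satisfied while Token is only an alias for Operator: a token now has to hold either an operator or a number. There are several ways to do that:
- Build a class hierarchy with Token as the base class and recover the concrete type with dynamic_cast.
- Build the representation on std::function.
- Use Boost.Any, which is ruled out here because only the standard library is used.
- Make Token a single class that knows its own type and stores the value for it.
The last option is the simplest, so that is the one to implement. Token gets its own tests.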
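For comparison, the class-hierarchy option mentioned above might look roughly like the sketch below. All names here (TokenBase, NumberToken, OperatorToken, TryGetNumber, SumNumbers) are hypothetical and are not used anywhere else in the article; the point is only to show what each consumer would have to do.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical hierarchy: the alternative that is NOT chosen in the article.
struct TokenBase {
    virtual ~TokenBase() {}
};
struct NumberToken : TokenBase {
    explicit NumberToken(double v) : value(v) {}
    double value;
};
struct OperatorToken : TokenBase {
    explicit OperatorToken(wchar_t s) : symbol(s) {}
    wchar_t symbol;
};

// Every consumer has to probe the dynamic type before it can read the payload.
inline bool TryGetNumber(const TokenBase &token, double &out) {
    if(auto number = dynamic_cast<const NumberToken *>(&token)) {
        out = number->value;
        return true;
    }
    return false;
}

// Sums the numeric tokens of a heap-allocated token list.
inline double SumNumbers(const std::vector<std::unique_ptr<TokenBase>> &tokens) {
    double sum = 0.0;
    for(const auto &token : tokens) {
        double value = 0.0;
        if(TryGetNumber(*token, value)) sum += value;
    }
    return sum;
}
```

Tokens then live on the heap behind pointers and cannot be compared or copied by value, which is noticeably heavier than a single value type for a two-alternative token.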
The test list for Token: get the type of a token, get the operator from an operator token, get the number from a number token.
Start with the type of an operator token.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
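A ToString function for TokenType is needed as well, so that Assert can print it. The next test covers number tokens.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The test fails. Apply the (constant → scalar) transformation: the constant return value becomes a member variable.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};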
Next in the list: get the operator back from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
A stored member and a conversion operator make it pass.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same test for numbers.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two members can share storage in an anonymous union. Each conversion operator checks the type first and throws if the token holds the other alternative.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token.";
    }
}
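The guarded conversions are the part of this class that is easiest to overlook, so here is a condensed, standalone copy that exercises them. The Sketch namespace and the RejectsOperatorAccess helper are hypothetical additions for this illustration; the Token class itself is reduced to what the example needs.

```cpp
#include <cassert>
#include <stdexcept>

namespace Sketch { // hypothetical namespace, keeps this apart from the article's code

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;   // tag: says which union member is active
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true if reading `token` as an Operator is rejected.
inline bool RejectsOperatorAccess(const Token &token) {
    try {
        (void)static_cast<Operator>(token);
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

} // namespace Sketch
```

Reading the inactive member of a union is undefined behaviour in C++, so the type check in each conversion operator is what keeps this representation safe to use.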
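Token can now report its type and hand back its value, so it is time to return to the lexer and its single-digit test.
The simplest code that makes it pass:
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
Crude, but the test is green.
Next in the list: a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}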
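There is no point in parsing numbers by hand when the C library already has isdigit and atof, or rather their wchar_t counterparts, iswdigit and _wtof (the latter is specific to the Microsoft runtime). This is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}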
The test passes. Next: an operator followed by a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
The function returns as soon as it has one token, so it cannot produce two. First a small refactoring with the existing tests still green: tokens are collected in a result variable, which is returned at the end.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the transformation itself: (if → while). The check for the end of the string becomes the loop condition.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the parsed number
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
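Note that wcstod has replaced _wtof. Both convert the text to a double, but wcstod also reports, through its second argument, where the number ended, so scanning can continue from that position.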
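The end-pointer behaviour of wcstod is easy to check in isolation. The sketch below wraps it in a hypothetical ScanNumber helper (not the article's method of the same name) that advances a cursor and reports whether anything was parsed.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper: parses a number starting at `cursor` and moves `cursor`
// past it. Returns false and leaves `cursor` untouched if nothing was parsed.
inline bool ScanNumber(const wchar_t *&cursor, double &value) {
    wchar_t *end = nullptr;
    const double parsed = std::wcstod(cursor, &end);
    if(end == cursor) return false; // wcstod consumed no characters
    value = parsed;
    cursor = end;
    return true;
}
```

One caveat: wcstod itself skips leading whitespace and accepts a leading sign, so calling it on L"+3" yields 3. The lexer in this article avoids that by calling it only when iswdigit has already confirmed that the current character is a digit.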
The test passes. Next in the list: spaces must be skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
One more (unconditional → if): only a recognised operator character becomes a token, and anything else is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
The test is green, but the function has grown, so it is time to refactor. The logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}
    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext(); // skip anything that is not part of a token
            }
        }
    }
    const Tokens &Result() const { return m_result; }
private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The last test for the lexer is a complete expression with all the operators and parentheses.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enumeration is extended with the remaining values.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer is updated to check all of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
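The membership check can also be tried outside the Tokenizer class. The sketch below is a free-function variant under the hypothetical name IsOperatorChar; it takes the character as a parameter instead of reading m_current, and repeats the enumeration so that it compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')'
};

// Hypothetical free-function variant of Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t ch) {
    // `all` is deduced as std::initializer_list<Operator>.
    const auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                       Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
                       [ch](Operator o) { return ch == static_cast<wchar_t>(o); });
}
```

The list of operators has to be kept in sync with the enumeration by hand, which is acceptable for six values but worth remembering when new operators are added.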
All the lexer tests pass. The complete code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}
    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }
    const Tokens &Result() const { return m_result; }
private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }
    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }
    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code for this article is also available on GitHub.
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
To print values in failure messages, Assert needs a ToString function for every compared type, so we provide one for Token.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
To keep the empty-expression test passing as well, we apply the (unconditional → if) transformation.

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test is for a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
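A token now has to hold either an operator or a number. There are several ways to model that, among them a Token class hierarchy queried with dynamic_cast, std::function, and Boost.Any. Here the token simply stores its type next to its value.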
We begin with the token type, in a separate test class, TokenTests.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
Assert also needs a ToString function for TokenType. The next test checks the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The (constant → scalar) transformation replaces the hard-coded type with a member variable.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
The next test reads the operator back from an operator token.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The operator is stored in a member and returned through a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same test for a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
Since a token holds either an operator or a number, but never both, the two values can share storage in a union. Each conversion operator checks the token type and throws std::logic_error when the wrong kind of value is requested.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        // Wide literal: the function returns std::wstring.
        default: return L"Unknown token.";
    }
}
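The effect of the type checks in the conversion operators can be seen in a short sketch. It repeats a condensed copy of the Token class only to stay self-contained. RejectsNumberAccess is a hypothetical helper introduced here for illustration, not part of the article's code.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Condensed copy of the article's Token class.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: reports whether reading `token` as a number is rejected.
bool RejectsNumberAccess(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch (const std::logic_error &) {
        return true;
    }
}
```

With this design a misuse such as reading a number out of an operator token fails loudly at run time instead of silently reinterpreting the union's storage.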
Now we can return to the lexer and make the single-digit test pass:

if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
The next test is for a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
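The C standard library already offers isdigit to recognize a digit and atof to parse a number. For wchar_t strings the counterparts used here are iswdigit and _wtof, the latter being specific to the Microsoft runtime. This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}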
The next test combines an operator and a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
Before making this test pass, we refactor: tokens are now collected in a local variable, result, which is returned at the end.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the transformation: (if → while). The condition becomes a loop that runs until the end of the string.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            // Continue scanning right after the parsed number.
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
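wcstod replaces _wtof here because, besides the value, it reports through its second argument where the parsed number ends. The scan can then continue from that position.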
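How the end pointer drives the scan can be shown in isolation. The sketch below is an assumption-laden illustration rather than the article's code: ExtractNumbers is a hypothetical function that keeps only the numbers and ignores every other character.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in `text`,
// skipping all characters that do not start a number.
std::vector<double> ExtractNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while (*current != L'\0') {
        if (std::iswdigit(static_cast<std::wint_t>(*current))) {
            wchar_t *end = nullptr;
            // wcstod stores in `end` the position just past the parsed number.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        } else {
            ++current;
        }
    }
    return numbers;
}
```

Because the call happens only when the current character is a digit, wcstod always consumes at least one character, so the loop is guaranteed to advance.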
The next test checks that spaces are skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
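(unconditional → if) again: a character becomes an operator token only if it is a known operator, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        // Not a digit and not an operator: skip the character.
        ++current;
    }
}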
Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function delegates to it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The last lexer test is a complex expression with all operators and parentheses.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enumeration gets the remaining operators and both parentheses.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of the Tokenizer class now checks the current character against every operator.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
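The same membership check can be written as a lookup in a string of operator characters. This is only an alternative sketch: IsOperatorChar is a hypothetical free function, and its character set is assumed to mirror the Operator enumeration.

```cpp
#include <string>

// Hypothetical alternative to IsOperator: the character set below is assumed
// to match the enumerators of Operator and must be kept in sync by hand.
inline bool IsOperatorChar(wchar_t ch) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(ch) != std::wstring::npos;
}
```

The lookup is shorter, but it duplicates the operator characters outside the enumeration, which the std::any_of version avoids.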
The lexer is done. The complete code at this point follows.

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
With the code in this shape, the remaining operators and the parentheses can be covered by a single test of a more complex expression.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enumeration gets the missing operators and both parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer is changed to check the current character against all of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                 Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
All tests pass. The complete code at this point follows.
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // The conversions throw if the token holds the other kind of value.
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                     Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer

} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
GitHub . , . "__".
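For the first test to compile, a minimal header is enough. The lexer returns a std::vector of tokens, given the name Tokens, and Tokenize simply throws for now so that the test fails.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};

typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer

} // namespace Interpreter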
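To make a failing test pass, the code should be changed in the simplest way that works. Robert Martin's article The Transformation Priority Premise (TPP) proposes an ordered list of such changes, called transformations. When several transformations would make the test pass, the one nearer the top of the list is preferred, because it leads to simpler code.
The transformations, in priority order:
- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): nil is replaced with a constant.
- (constant → constant+): a simple constant is replaced with a more complex one.
- (constant → scalar): a constant is replaced with a variable or an argument.
- (statement → statements): more unconditional statements are added (break, continue, return and the like).
- (unconditional → if): the execution path is split.
- (scalar → array): a variable or an argument is replaced with an array.
- (array → container): an array is replaced with a more complex container.
- (statement → recursion): a statement is replaced with recursion.
- (if → while): a condition is replaced with a loop.
- (expression → function): an expression is replaced with a function or an algorithm.
- (variable → assignment): the value of a variable is changed.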
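As a rough illustration of how these transformations escalate, here are three successive versions of a hypothetical Sum function, each being the smallest change that satisfies one more test. The function and its version suffixes are invented for this sketch and have nothing to do with the interpreter itself.

```cpp
#include <cstddef>
#include <vector>

// Test 1: the sum of an empty list is 0.
// (nil -> constant): return a hard-coded value.
inline int SumV1(const std::vector<int> &) {
    return 0;
}

// Test 2: the sum of a one-element list is that element.
// (unconditional -> if) splits the path,
// (constant -> scalar) returns a value taken from the argument.
inline int SumV2(const std::vector<int> &values) {
    if(values.empty()) return 0;
    return values[0];
}

// Test 3: the sum of several elements.
// (if -> while): the single check becomes a loop.
inline int SumV3(const std::vector<int> &values) {
    int total = 0;
    std::size_t i = 0;
    while(i < values.size()) {
        total += values[i];
        ++i;
    }
    return total;
}
```

Each version still passes the tests written for the previous ones, which is what allows the steps to stay this small.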
For the first test it is enough to return an empty list.
inline Tokens Tokenize(std::string expr) {
    return{};
}
The test passes. Before the next one, std::string is replaced with std::wstring so that expressions can contain Unicode characters.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
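Operator is an enumeration with wchar_t as its underlying type, so a character can be cast straight to an operator. For now, Token is just an alias for it.
enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;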
For Assert to be able to print values of this type in its failure messages, a ToString function has to be defined for it.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
Now the test compiles but fails. To make it pass, the (unconditional → if) transformation is applied.
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test is for a number made of a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
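This test does not compile: a Token can hold only an operator, while the lexer now has to return numbers as well. Among the ways to store values of different types in one token are a Token class hierarchy inspected with dynamic_cast, a design based on std::function, and a type-erased holder such as Boost.Any. Here Token becomes a small class that records which kind of value it holds, and it is developed test-first, starting with the query for its type.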
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
A ToString function for TokenType is needed as well for the assertion to compile. The next test is for a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The test fails. The (constant → scalar) transformation replaces the hard-coded type with a member variable.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, it should be possible to get the operator back from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The operator is stored in a member and returned through a conversion operator.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same goes for the value of a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
Since a token holds either an operator or a number but never both, the two members can share storage in a union. The conversions check the type first and throw if the wrong one is requested.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // A wide literal is required here, since the function returns std::wstring.
            return L"Unknown token.";
    }
}
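For comparison only: in C++17 and later, which the compiler used in this article does not support, std::variant provides the same either-or storage without a hand-written tag. The sketch below is an assumption-laden alternative and not the article's design. VariantToken, IsNumber, AsNumber and AsOperator are hypothetical names, and Operator is redeclared in shortened form so that the block compiles on its own.

```cpp
#include <variant>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// Hypothetical alternative to the hand-written union: the variant itself
// remembers which alternative it currently holds.
using VariantToken = std::variant<Operator, double>;

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}

// std::get throws std::bad_variant_access when the token holds the other type,
// which plays the role of the std::logic_error thrown by the Token conversions.
inline double AsNumber(const VariantToken &token) {
    return std::get<double>(token);
}

inline Operator AsOperator(const VariantToken &token) {
    return std::get<Operator>(token);
}
```

The hand-written class keeps the implicit conversions that the tests rely on, such as passing a token directly to Assert::AreEqual<double>. With a variant those call sites would have to go through an accessor instead.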
With Token in place, the lexer test for a single digit can be made to pass:
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
Next, a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
Rather than parse the number by hand, the C library can do the work: isdigit and atof, or more precisely their counterparts for wchar_t. This is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
Next, an operator followed by a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
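The test fails, because only the first token is returned. As a preparatory step, the tokens are collected in a result variable, which is returned at the end.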
GitHub . , . "__".
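The first test needs two things before it can compile: the Tokens type and the Tokenize function. Tokens is a std::vector of Token, and Tokenize starts out as a stub that throws, so the test compiles and fails.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: throws so that the first test fails (red).
inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter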
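To decide how to change the code so that a failing test passes, this walkthrough follows The Transformation Priority Premise (TPP). Each change to the code is one of a small set of transformations, and the simpler transformations near the top of the list are preferred over the more complex ones below them.
The transformations, from simplest to most complex:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) From nil to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) From a constant to a variable or an argument.
- (statement → statements) Adding more unconditional statements (break, continue, return and the like).
- (unconditional → if) Splitting the execution path.
- (scalar → array) From a single value to an array.
- (array → container) From an array to a more general container.
- (statement → recursion) From a statement to a recursive call.
- (if → while) From a condition to a loop.
- (expression → function) From an expression to a function or algorithm.
- (variable → assignment) Replacing the value of a variable.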
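To make the list a little more concrete, here is a loose sketch of one function going through three of these transformations. The function names (CountDigits1 to CountDigits3) are hypothetical and not part of the interpreter. The last stage also introduces a counter variable, so it is not a pure single-step transformation.

```cpp
#include <cwctype>

// Stage 1, (nil → constant): just enough for a test on an empty string.
inline int CountDigits1(const wchar_t *) { return 0; }

// Stage 2, (unconditional → if): recognizes a single leading digit.
inline int CountDigits2(const wchar_t *text) {
    if (std::iswdigit(*text)) return 1;
    return 0;
}

// Stage 3, mainly (if → while): counts any number of leading digits.
inline int CountDigits3(const wchar_t *text) {
    int count = 0;
    while (std::iswdigit(*text)) {
        ++count;
        ++text;
    }
    return count;
}
```

Each stage is driven by one new failing test, which is the rhythm the lexer follows in the rest of this section.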
For the first test it is enough to return an empty list.
inline Tokens Tokenize(std::string expr) { return{}; }
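The next test checks a single plus operator. At the same time, std::string is replaced with std::wstring, since the expressions are handled as Unicode strings.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}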
AssertRange is a small helper namespace. Its AreEqual function first compares the sizes of the expected list and the actual range, then compares them element by element and reports the position of the first mismatch.
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    // Compare the sizes first, so a length mismatch is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual);
          ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position "
            + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
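Operator is an enumeration whose underlying type is wchar_t, so the value of each enumerator is the operator character itself. For now, a token is simply an operator.
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;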
For Assert to print values of a custom type in failure messages, a ToString function has to be provided for that type.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
Now the test compiles and fails. To make it pass, apply the (unconditional → if) transformation.
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test is a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
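This test raises a design question: a token now has to hold either an operator or a number. The options considered include a Token class hierarchy with dynamic_cast, std::function, and a type-erasing container such as Boost.Any. The approach taken below is a single Token class that records which kind of value it holds.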
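One further option, which the Visual Studio 2013 toolchain used here does not offer, is std::variant from C++17. The sketch below is only an illustration under that assumption and is not the article's design. The names VariantToken, TypeOf and NumberOf are hypothetical, and the Operator enumeration is reduced to two values to keep the block short.

```cpp
#include <stdexcept>
#include <variant>

// Hypothetical illustration only: a token built on C++17 std::variant.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Holds exactly one alternative at a time: an Operator or a double.
using VariantToken = std::variant<Operator, double>;

// Reports which alternative is currently stored.
inline TokenType TypeOf(const VariantToken &token) {
    return std::holds_alternative<Operator>(token) ? TokenType::Operator
                                                   : TokenType::Number;
}

// Returns the number, or throws std::logic_error for an operator token.
inline double NumberOf(const VariantToken &token) {
    if(const double *number = std::get_if<double>(&token)) {
        return *number;
    }
    throw std::logic_error("Should be number token.");
}
```

The variant keeps track of the active alternative by itself, so no separate type field is needed. The hand-written class in this article does the same bookkeeping explicitly.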
Start with a test for the type of an operator token.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
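A ToString function for TokenType is needed as well, so that Assert can print it. The next test checks the type of a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}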
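The test fails. Apply the (constant → scalar) transformation: the constant return value becomes a member variable set by the constructor.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};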
The next test gets the operator back from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same is needed for the value of a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in a union. Each conversion operator checks the stored type and throws if the token holds the other kind of value.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
With Token in place, return to the lexer test for a single digit. Tokenize changes as follows:
if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };
The next test is a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
The C standard library already has what is needed here: isdigit and atof, or rather their wchar_t counterparts. This is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
The next test combines an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
To return more than one token, first refactor: tokens are collected in a result variable, which is returned at the end.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now apply the next transformation: (if → while).
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the parsed number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
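wcstod replaces _wtof here because, besides the value, it reports through its second argument where the number ends. The loop needs that position to move on to the next token.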
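The end-pointer behaviour of wcstod can be tried in isolation. The sketch below uses a hypothetical helper, ScanNumbers, that is not part of the interpreter. Unlike the lexer above, it calls wcstod at every position instead of checking iswdigit first, so it would also accept a leading sign as part of a number.

```cpp
#include <cwchar>
#include <vector>

// Hypothetical helper: collects every number found in `text`,
// skipping any character that does not start a number.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    const wchar_t *current = text;
    while(*current != L'\0') {
        wchar_t *end = nullptr;
        const double value = std::wcstod(current, &end);
        if(end == current) {
            ++current;      // nothing parsed here: skip one character
        } else {
            numbers.push_back(value);
            current = end;  // resume right after the parsed number
        }
    }
    return numbers;
}
```

When wcstod cannot convert anything, it leaves the end pointer equal to the start pointer. That is how the helper tells "no number here" apart from a parsed value of zero.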
The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
Apply (unconditional → if): a character becomes an operator token only if it is an operator, and anything else is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current; // not a digit or an operator: skip it
    }
}
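Time to refactor. The logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}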
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The last lexer test covers a complex expression with all the operators and parentheses.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
        Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
        Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
Extend the Operator enumeration with the remaining operators and the parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer has to recognize all of them.
bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
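The same membership check can also be written as a lookup in a string of operator characters. This is only a sketch of an alternative, not the article's code. The function IsOperatorChar and the array kOperators are hypothetical names, and the character list repeats the values of the Operator enumeration, so the two would have to be kept in sync by hand.

```cpp
#include <cwchar>

// Hypothetical alternative to the std::any_of check: look the character
// up in a string listing every operator character from the article.
inline bool IsOperatorChar(wchar_t c) {
    static const wchar_t kOperators[] = L"+-*/()";
    // wcschr also matches the terminating null, so rule it out first.
    return c != L'\0' && std::wcschr(kOperators, c) != nullptr;
}
```

The any_of version is driven by the enumeration itself, which arguably makes it the safer choice once more operators are added.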
The complete code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual);
          ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position "
            + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
Next test: an operator followed by a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
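The test fails because the function returns after the first token. As a preparatory step, tokens are collected in a local variable result and returned at the end. Behavior does not change yet.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}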
Now the transformation itself: (if → while). The loop walks the string until the terminating zero.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
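wcstod replaces _wtof here because, along with the value, it returns a pointer to the first character after the number. That pointer tells the loop where to continue.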
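The end pointer reported by wcstod can be observed in isolation. This is a small sketch, not code from the article: ScanResult, ScanNumber and CheckScanNumber are hypothetical names, and the inputs use values that are exactly representable as double.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper type: the parsed value and how many characters it took.
struct ScanResult {
    double value;
    std::size_t consumed;
};

// Parses a number at the start of text; consumed is 0 when nothing was parsed.
inline ScanResult ScanNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    return { value, static_cast<std::size_t>(end - text) };
}

// Hypothetical self-check showing the intended use.
inline void CheckScanNumber() {
    const ScanResult number = ScanNumber(L"12.5+3");
    assert(number.value == 12.5);
    assert(number.consumed == 4);          // "12.5" was read, "+3" is left
    assert(ScanNumber(L"+").consumed == 0); // a lone sign is not a number
}
```

Note that wcstod itself accepts a leading sign and leading whitespace, which is one reason the lexer only calls it after iswdigit has confirmed that the current character is a digit.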
Next test: spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
An (unconditional → if) transformation handles it: a character is stored as an operator only if it is one, and anything else is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}
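The three branches of this loop (number, operator, anything else) can be tried without the test framework. The sketch below makes several assumptions: it returns the lexemes as strings instead of Token objects, it recognizes operators with wcschr over a hard-coded character set rather than through the Operator enumeration, and SplitLexemes and CheckSplitLexemes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical sketch of the lexer loop that yields lexeme strings.
inline std::vector<std::wstring> SplitLexemes(const std::wstring &expr) {
    std::vector<std::wstring> lexemes;
    const wchar_t *current = expr.c_str();
    while (*current) {
        if (std::iswdigit(*current)) {
            // Only the end position matters here; the value is discarded.
            wchar_t *end = nullptr;
            (void)std::wcstod(current, &end);
            lexemes.push_back(
                std::wstring(current, static_cast<std::size_t>(end - current)));
            current = end;
        } else if (std::wcschr(L"+-*/()", *current)) {
            lexemes.push_back(std::wstring(1, *current));
            ++current;
        } else {
            ++current; // spaces and unknown characters are skipped
        }
    }
    return lexemes;
}

// Hypothetical self-check showing the intended use.
inline void CheckSplitLexemes() {
    const std::vector<std::wstring> expected = { L"1", L"+", L"12.34" };
    assert(SplitLexemes(L" 1 + 12.34 ") == expected);
}
```

Skipping every unrecognized character keeps the loop simple, but it also means that malformed input is accepted silently; a stricter lexer would report it.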
All tests pass, so it is time to refactor. The loop moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The structure is in place. One more test makes sure that a complete expression with all the operators and parentheses is tokenized.
TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
        Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
        Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enumeration gets the remaining operators and the parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer now checks the current character against all of them.
bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
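The same membership test can be written as a free function, which makes it easy to try on individual characters. This is a sketch under assumptions: IsOperatorChar and CheckIsOperatorChar are hypothetical names, and a static array stands in for the initializer list used in the method.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/',
    LParen = L'(', RParen = L')',
};

// Hypothetical free-function variant of Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t ch) {
    static const Operator all[] = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(std::begin(all), std::end(all),
        [ch](Operator op) { return ch == static_cast<wchar_t>(op); });
}

// Hypothetical self-check showing the intended use.
inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'*'));
    assert(!IsOperatorChar(L' '));
}
```

One consequence of this design is that the list repeats the enumeration: a new operator has to be added in both places, and a test such as the one for the complex expression is what catches a forgotten entry.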
That completes the lexer. The full code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
    case TokenType::Operator: return L"Operator";
    case TokenType::Number: return L"Number";
    default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
            case Interpreter::TokenType::Operator:
                return left.m_operator == right.m_operator;
            case Interpreter::TokenType::Number:
                return left.m_number == right.m_number;
            default:
                throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
    case TokenType::Number:
        return std::to_wstring(static_cast<double>(token));
    case TokenType::Operator:
        return ToString(static_cast<Operator>(token));
    default:
        throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual);
        ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position "
            + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code for the article is available on GitHub.
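The test does not compile yet, so we add just enough to build it. Tokens are stored in a std::vector, for which Tokens is an alias. Tokenize throws, so the test fails as it should.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter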
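How large should the step that makes a test pass be? One guideline is The Transformation Priority Premise (TPP), proposed by Robert Martin. Refactorings change the structure of code without changing its behavior. Transformations are their counterpart: each one changes behavior and makes the code slightly more general. They are ordered from simple to complex, and the premise is that a failing test should be made to pass with the simplest transformation that works. When tests are written in an order that only ever requires simple transformations, the implementation tends to stay simple as well. The rest of the article names the TPP transformation used at each step.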
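The transformations, from simplest to most complex:
- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): from nil to a constant.
- (constant → constant+): from a simple constant to a more complex one.
- (constant → scalar): from a constant to a variable or an argument.
- (statement → statements): adding more unconditional statements (break, continue, return and the like).
- (unconditional → if): splitting the execution path.
- (scalar → array): from a variable or an argument to an array.
- (array → container): from an array to a more complex container.
- (statement → recursion): introducing recursion.
- (if → while): from a condition to a loop.
- (expression → function): replacing an expression with a function or an algorithm.
- (variable → assignment): replacing the value of a variable.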
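A few of these transformations can be seen on a deliberately tiny function. The example below is hypothetical and not part of the article's project: CountChars counts the characters of a string, and each namespace holds the version that the next test would force under the premise.

```cpp
#include <cassert>
#include <cstddef>

namespace Step1 {
// (nil → constant): a test for the empty string passes with a constant.
inline std::size_t CountChars(const wchar_t *) { return 0; }
} // namespace Step1

namespace Step2 {
// (unconditional → if): a test for a one-character string splits the path.
inline std::size_t CountChars(const wchar_t *text) {
    if (!*text) return 0;
    return 1;
}
} // namespace Step2

namespace Step3 {
// (if → while): a test for a longer string turns the condition into a loop.
inline std::size_t CountChars(const wchar_t *text) {
    std::size_t count = 0;
    while (*text) {
        ++count;
        ++text;
    }
    return count;
}
} // namespace Step3

// Hypothetical self-check: every step still passes the tests before it.
inline void CheckSteps() {
    assert(Step1::CountChars(L"") == 0);
    assert(Step2::CountChars(L"") == 0);
    assert(Step2::CountChars(L"a") == 1);
    assert(Step3::CountChars(L"") == 0);
    assert(Step3::CountChars(L"a") == 1);
    assert(Step3::CountChars(L"abc") == 3);
}
```

The lexer in this article follows the same path: an empty result first, then a branch on the first character, then a loop over the whole string.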
The simplest way to make the test pass is to return an empty list.
inline Tokens Tokenize(std::string expr) { return{}; }
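The test passes, and there is nothing to refactor yet. Note that from here on std::string is replaced with std::wstring, so that expressions are handled as Unicode text. The next test is for a single operator.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}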
AssertRange is a small helper of our own. Its AreEqual compares an expected initializer list with an actual range element by element and reports either a difference in size or the position of the first mismatch.
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual);
        ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position "
            + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
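The comparison itself does not depend on the Visual Studio framework. The sketch below shows the same element-by-element walk in standard C++ only. It is an illustration under assumptions: FirstMismatch and CheckFirstMismatch are hypothetical names, and instead of raising an assertion the function returns the position of the first difference, or -1 when the ranges are equal.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: index of the first differing element, the length of
// the shorter range when only the sizes differ, or -1 when both are equal.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter)) return position;
    }
    const bool sameSize =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameSize ? -1 : position;
}

// Hypothetical self-check showing the intended use.
inline void CheckFirstMismatch() {
    const std::vector<int> actual = { 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1);
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);
}
```

Taking the expected values as an initializer list is what allows a test to spell them inline in braces, as the lexer tests do.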
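Operator is an enumeration whose underlying type is wchar_t, so the value of each operator is its own character. For now, Token is simply an alias for it.
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;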
To describe a failure, Assert has to turn the compared values into text, so it needs a ToString function for the type.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
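The test fails because the lexer still returns an empty list. Returning the operator unconditionally would break the first test, so the execution path has to split: (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}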
Next test: a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
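This test does not even compile: a token now has to carry either an operator or a number. There are several ways to design that. One is a hierarchy of classes derived from Token, with dynamic_cast to get at the concrete type. Another can be built on std::function. A third is a generic holder such as Boost.Any. The simplest option is a single class that remembers which kind of value it holds, and that is the one used here.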
The first step is the token type.
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
TokenType needs its own ToString as well, so that Assert can print it. The next test covers the type of a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code for the article is available on GitHub.
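The list of tokens is a std::vector, aliased as Tokens. The first version of Tokenize does nothing but throw, so the test compiles and fails (red).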
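#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Deliberately fails: the first test has to be seen failing before it is made to pass.
inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter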
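Making a failing test pass means changing the behavior of the code, and there is usually more than one way to do it. Robert Martin's Transformation Priority Premise (TPP) offers a guideline. Refactorings change the structure of code without changing its behavior; transformations change the behavior, and they can be ordered from simple to complex. When a test has to pass, prefer the transformation that stands higher in the list, and pick the next test so that a simple transformation is enough.
The transformations, from highest to lowest priority:
- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): nil is replaced with a constant.
- (constant → constant+): a simple constant is replaced with a more complex one.
- (constant → scalar): a constant is replaced with a variable or an argument.
- (statement → statements): more unconditional statements are added (break, continue, return and so on).
- (unconditional → if): the execution path is split.
- (scalar → array): a single value is replaced with an array.
- (array → container): an array is replaced with a container.
- (statement → recursion): a statement is replaced with a recursive call.
- (if → while): a condition is replaced with a loop.
- (expression → function): an expression is replaced with a function or an algorithm.
- (variable → assignment): the value of a variable is replaced.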
To make the first test pass, it is enough to return an empty list.
inline Tokens Tokenize(std::string expr) {
    return{};
}
Before moving on, std::string is replaced with std::wstring so that the lexer can accept Unicode input. The next test tokenizes a single operator.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
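AssertRange is a small helper namespace. Its AreEqual function first compares the sizes of the expected list and the actual range, then compares them element by element and reports the position of the first mismatch.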
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
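The comparison logic of AssertRange::AreEqual does not depend on the Visual Studio test framework. The sketch below reproduces it with the standard library only, which may help when trying the idea with another compiler. FirstMismatch and FirstMismatchExample are hypothetical names introduced for this illustration; they are not part of the article's code.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper (an assumption, not the article's code): returns the index
// of the first position where the two sequences differ, or -1 when they are equal.
// A length mismatch counts as a difference at the end of the shorter sequence.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = expect.begin();
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;

    for(; expectIter != expect.end() && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    // One of the sequences ended early: report the position where it stopped.
    if(expectIter != expect.end() || actualIter != std::end(actual)) return index;
    return -1;
}

// Usage example.
inline void FirstMismatchExample() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1); // equal
    assert(FirstMismatch({ 1, 9, 3 }, actual) == 1);  // element at index 1 differs
    assert(FirstMismatch({ 1, 2 }, actual) == 2);     // expected list is shorter
}
```

Returning an index instead of asserting keeps the sketch independent of any test framework; a real test helper would turn that index into a failure message, as AssertRange::AreEqual does.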
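Operator is an enum class with wchar_t as its underlying type, so the value of each enumerator is the character it stands for. For now Token is simply an alias for Operator.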
enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;
For Assert to be able to describe a failure, the compared type needs a ToString function that turns a value into a string.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
The test now compiles and fails, as it should. To make it pass, apply the (unconditional → if) transformation.
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test tokenizes a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
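This test does not compile yet: a token now has to hold either an operator or a number. The options that come to mind involve a Token class hierarchy inspected with dynamic_cast, std::function, or Boost.Any. Here Token becomes a small class that knows its own type, and it is developed test-first in a test class of its own, TokenTests.
The first test checks the type of an operator token.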
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}

    TokenType Type() const {
        return TokenType::Operator;
    }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
Assert also needs a ToString function for TokenType. The next test covers a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
To make it pass, apply the (constant → scalar) transformation: the hard-coded type becomes a member variable.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}

    Token(double) :m_type(TokenType::Number) {}

    TokenType Type() const {
        return m_type;
    }

private:
    TokenType m_type;
};
The next test retrieves the operator from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The token stores the operator and exposes it through a conversion operator.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

    operator Operator() const {
        return m_operator;
    }
    …
    Operator m_operator;
};
The same test for the value of a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
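A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the token type before reading from it.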
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const {
        return m_type;
    }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
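To see what the type check in the conversion operators buys, here is a minimal, self-contained sketch. It repeats a compact copy of the Token class with a single operator so that it compiles on its own. NumberAccessThrows and TokenGuardExample are hypothetical helpers added only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Compact copy of the Token class above, reduced to what the example needs.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper (an assumption for this sketch): true when reading the
// token as a number is rejected with std::logic_error.
inline bool NumberAccessThrows(const Token &token) {
    try {
        (void)static_cast<double>(token);
        return false;
    }
    catch(const std::logic_error &) {
        return true;
    }
}

// Usage example.
inline void TokenGuardExample() {
    assert(NumberAccessThrows(Token(Operator::Plus))); // the wrong union member is never read
    assert(!NumberAccessThrows(Token(1.5)));
    assert(static_cast<double>(Token(1.5)) == 1.5);
    assert(static_cast<Operator>(Token(Operator::Plus)) == Operator::Plus);
}
```

The design choice here is that the guard turns a silent read of the inactive union member into an exception that a test can observe.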
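With Token in place, it is time to return to the lexer. The single-digit test passes after the following change to Tokenize: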
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
The next test uses a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
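Rather than parsing the number by hand, use the C standard library: isdigit to recognize a digit and atof to convert the text, or rather their counterparts for wchar_t, iswdigit and _wtof. This is the (expression → function) transformation.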
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
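Note that _wtof is specific to the Microsoft C runtime. A minimal sketch of the same two checks using only standard functions, for readers who follow along with another compiler, could look like this. StartsWithDigit and ToDouble are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>

// Hypothetical helper: true when the text begins with a decimal digit.
// An empty string starts with L'\0', which is not a digit.
inline bool StartsWithDigit(const wchar_t *text) {
    return std::iswdigit(*text) != 0;
}

// Hypothetical helper: standard replacement for _wtof. Passing nullptr as the
// second argument means the end position of the number is not needed.
inline double ToDouble(const wchar_t *text) {
    return std::wcstod(text, nullptr);
}

// Usage example.
inline void PortableNumberExample() {
    assert(StartsWithDigit(L"12.34"));
    assert(!StartsWithDigit(L"+12.34"));
    assert(ToDouble(L"12.34") == 12.34);
}
```

The exact comparison with 12.34 assumes the default "C" locale, where the decimal separator is a period.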
The next test combines an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
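The current Tokenize returns as soon as it finds the first token, so it cannot produce two. Before changing the behavior, refactor: collect the tokens in a local variable, result, and return it at the end.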
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
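Now the transformation itself: (if → while). The check for an empty string turns into a loop that runs until the end of the expression.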
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
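wcstod replaces _wtof because, in addition to the value, it reports through its second parameter where the number ends. The scan continues from that position.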
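The role of the end pointer is easy to check in isolation. The following sketch wraps wcstod in a hypothetical helper, ScanNumberAt, that returns both the parsed value and the number of characters consumed; the helper and the ScanResult struct are assumptions made for this illustration and do not appear in the article's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical result type: the parsed value and how many characters it occupied.
struct ScanResult {
    double value;
    std::size_t length;
};

// Hypothetical helper: parses a number at the start of `text`.
// wcstod stores in `end` a pointer to the first character after the number;
// when nothing can be parsed, `end` equals `text` and the length is 0.
inline ScanResult ScanNumberAt(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    return{ value, static_cast<std::size_t>(end - text) };
}

// Usage example.
inline void ScanNumberExample() {
    const ScanResult number = ScanNumberAt(L"12.34+5");
    assert(number.value == 12.34);
    assert(number.length == 5); // scanning stops at L'+'

    assert(ScanNumberAt(L"abc").length == 0); // nothing consumed
}
```

One consequence worth knowing is that wcstod accepts everything the C library considers a floating-point literal, including exponents, so the lexer inherits that syntax without extra code.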
The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
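An (unconditional → if) transformation does the job: a character becomes an operator token only if it is a known operator; anything else is skipped.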
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
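The tests are green, so this is a good moment to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize is reduced to a thin wrapper around it.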
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const {
        return m_result;
    }

private:
    bool EndOfExperssion() const {
        return *m_current == L'\0';
    }

    bool IsNumber() const {
        return iswdigit(*m_current) != 0;
    }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() {
        ++m_current;
    }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
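The scanning loop can also be tried without Visual Studio or the Token class. The sketch below is a deliberately simplified variant, not the article's implementation: it keeps tokens as strings, recognizes the full operator set from the final listing, and uses only the standard library. SplitTokens and SplitTokensExample are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical simplified lexer (an assumption for illustration): tokens are
// kept as strings so that the sketch needs no Token class.
inline std::vector<std::wstring> SplitTokens(const std::wstring &expr) {
    std::vector<std::wstring> result;
    const wchar_t *current = expr.c_str();

    while(*current != L'\0') {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            (void)std::wcstod(current, &end); // used only to find where the number ends
            result.emplace_back(current, static_cast<std::size_t>(end - current));
            current = end;
        }
        else if(std::wcschr(L"+-*/()", *current) != nullptr) {
            result.emplace_back(1, *current); // a one-character operator token
            ++current;
        }
        else {
            ++current; // skip spaces and any other unknown character
        }
    }
    return result;
}

// Usage example.
inline void SplitTokensExample() {
    const std::vector<std::wstring> tokens = SplitTokens(L" 1 + 12.34 ");
    assert(tokens.size() == 3);
    assert(tokens[0] == L"1");
    assert(tokens[1] == L"+");
    assert(tokens[2] == L"12.34");
}
```

Like the class above, this loop silently drops characters it does not recognize; reporting them as errors would be a separate design decision.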
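So far the lexer recognizes only the plus operator. The last lexer test uses an expression with every supported operator and parentheses.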
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enum receives the remaining operators and the parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer now has to check the current character against all of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
}
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
Assert also needs a ToString function for TokenType. The next test asks for the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

The test fails. Apply (constant → scalar): the constant return value becomes a member variable set by the constructors.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };

Next, the operator stored in a token must be retrievable.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

A member and a conversion operator are enough.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };

The same goes for numbers.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }

A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the type tag before reading from it.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }
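With a C++17 compiler (an assumption; the Visual Studio 2013 toolchain used in this article predates it), the hand-written tagged union could be sketched with std::variant, which keeps track of the active member by itself. The names VariantToken, IsOperator, IsNumber and NumberOf are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <variant>

enum class Operator : wchar_t { Plus = L'+' };

// Hypothetical alternative to the Token class (requires C++17).
using VariantToken = std::variant<Operator, double>;

inline bool IsOperator(const VariantToken &token) {
    return std::holds_alternative<Operator>(token);
}

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}

// Reads the number; std::get would throw std::bad_variant_access otherwise.
inline double NumberOf(const VariantToken &token) {
    assert(IsNumber(token));
    return std::get<double>(token);
}
```

Reading the wrong alternative through std::get throws std::bad_variant_access, which plays the role of the std::logic_error thrown by the conversion operators above.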
With Token in place, return to the lexer. The single-digit test passes once digits are told apart from operators:

    if(expr[0] >= L'0' && expr[0] <= L'9') {
        return{ static_cast<double>(expr[0] - L'0') };
    }
    return{ static_cast<Operator>(expr[0]) };

The next test is a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

The C standard library already has what is needed here: isdigit and atof, or rather their wchar_t counterparts (iswdigit and, in the Microsoft runtime, _wtof). The hand-written expression is replaced with a function call (expression → function).

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }

Next test: an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

Refactor before changing behavior: tokens are collected in a local variable, result, which is returned at the end.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        } else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }

Now the transformation itself: (if → while). The loop runs until the end of the string.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // continue right after the number
            } else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
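wcstod replaces _wtof because it also reports, through its second argument, where the number ends; the loop needs that position to continue with the rest of the expression. wcstod is also part of the C standard library, whereas _wtof is specific to the Microsoft runtime.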
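The behavior of the end pointer is easy to check in isolation. A minimal sketch, where ParseLeadingNumber is a hypothetical free function written only for this illustration:

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper: parses the number at the start of text, stores it in
// value and returns the position just past the number. If no number could
// be parsed, the returned pointer equals text.
inline const wchar_t *ParseLeadingNumber(const wchar_t *text, double &value) {
    assert(text != nullptr);
    wchar_t *end = nullptr;
    value = std::wcstod(text, &end);
    return end;
}
```

For the input L"12.34+5" the helper yields 12.34 and a pointer to the plus sign, which is exactly where the lexer loop has to resume.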
Next test: whitespace must be skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

(unconditional → if) once more: a character becomes an operator token only if it is a known operator, and anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        } else {
            ++current; // neither a digit nor an operator: skip it
        }
    }
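Note that the final else branch skips every character that is neither a digit nor a plus sign, not only whitespace. If unknown characters should be reported instead, one possible variation (an assumption, not the article's design) is to skip only what iswspace accepts and to throw on anything else. CharKind and Classify are hypothetical names.

```cpp
#include <cassert>
#include <cwctype>
#include <stdexcept>

enum class CharKind { Digit, Plus, Space };

// Hypothetical classifier: throws on characters the lexer does not know.
inline CharKind Classify(wchar_t ch) {
    assert(ch != L'\0'); // the caller is expected to stop at the terminator
    if(std::iswdigit(ch)) return CharKind::Digit;
    if(ch == L'+') return CharKind::Plus;
    if(std::iswspace(ch)) return CharKind::Space;
    throw std::invalid_argument("Unexpected character in expression.");
}
```

Whether silent skipping or an exception is the better default depends on how the interpreter is expected to report errors; the article's tests so far only require that spaces are ignored.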
Time to refactor. The scanning logic moves into a class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {
    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }
        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }
        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }
        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }
        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };
    } // namespace Detail

One last test for the lexer: a complete expression with every operator and parentheses.

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
Operator gains the remaining values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

IsOperator in Tokenizer now checks the current character against all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                     Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
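Because every Operator value is the operator's own character, the same check can also be sketched as a lookup in a string of operator characters. IsOperatorChar is a hypothetical free function, shown only as an alternative; unlike the version above, its character list has to be kept in sync with the enum by hand.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to IsOperator: look the character up in a string
// that lists all operator characters.
inline bool IsOperatorChar(wchar_t ch) {
    // wcschr also matches the terminating L'\0', so exclude it explicitly.
    return ch != L'\0' && std::wcschr(L"+-*/()", ch) != nullptr;
}
```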
All tests pass. The complete code at this point:

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <initializer_list>
    #include <stdexcept>   // std::logic_error, std::out_of_range
    #include <string>      // std::wstring, std::to_wstring
    #include <vector>
    #include <wchar.h>     // wcstod
    #include <wctype.h>    // iswdigit

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }
        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }
        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                         Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }
        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }
        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
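Token's operator== compares numbers with ==. That is enough for the lexer tests, where expected and actual values come from the same decimal literals, but exact comparison tends to become fragile once tokens hold computed values. A possible tolerance-based comparison is sketched below; NearlyEqual is a hypothetical name and the default epsilon is an arbitrary choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: the allowed difference is epsilon scaled by the
// larger magnitude, but never less than epsilon itself.
inline bool NearlyEqual(double left, double right, double epsilon = 1e-9) {
    assert(epsilon >= 0.0);
    const double scale = std::max({ 1.0, std::fabs(left), std::fabs(right) });
    return std::fabs(left - right) <= epsilon * scale;
}
```

Such a helper would most naturally be used in the tests of the evaluator rather than inside Token itself, so that token equality stays exact.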
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {
    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }
    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }
        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }
        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }
        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }
        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }
        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }
        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }
        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }
        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
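For readers without Visual Studio, the behavior covered by the lexer tests can be tried with a compact, framework-free sketch. It is a simplified stand-in, not the article's code: SimpleToken and TokenizePortable are hypothetical names, and the token is a plain struct instead of the Token class.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Simplified stand-in for the article's Token: a tag plus both payloads.
struct SimpleToken {
    bool isNumber;
    double number; // meaningful when isNumber is true
    wchar_t op;    // meaningful when isNumber is false
};

inline std::vector<SimpleToken> TokenizePortable(const std::wstring &expr) {
    std::vector<SimpleToken> result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            const double value = std::wcstod(current, &end);
            assert(end != current); // a leading digit always parses as a number
            result.push_back({ true, value, L'\0' });
            current = end;
        } else if(std::wcschr(L"+-*/()", *current)) {
            result.push_back({ false, 0.0, *current });
            ++current;
        } else {
            ++current; // skip whitespace and anything unknown, as the article does
        }
    }
    return result;
}
```

Tokenizing L"1+2*3/(4-5)" with this sketch produces eleven tokens, matching the expectation in Should_tokenize_complex_expression.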
The code for this article is available on GitHub.
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    // The value of each enumerator is the operator's character.
    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    // A token holds either an operator or a number; m_type says which one.
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext(); // anything else, including spaces, is skipped
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
The source code for this article is available on GitHub.
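The token list is an ordinary std::vector, aliased as Tokens. A stub that compiles but throws is enough to watch the first test fail.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: exists only so that the test compiles and fails.
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter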
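To choose the next change to the code we will rely on The Transformation Priority Premise (TPP), formulated by Robert C. Martin. The idea is that, just as a refactoring changes the structure of code without changing its behavior, a transformation changes behavior in a small step that makes the code slightly more general. The transformations are ordered from simple to complex, and to make a failing test pass one should prefer the simplest transformation that works. If a test seems to demand a complex transformation, that is often a sign that a simpler test should be written first. Following TPP keeps the steps small and helps avoid dead ends.

The transformations, in order of priority:
- ({} → nil) No code at all is replaced with code that returns nil.
- (nil → constant) Nil is replaced with a constant.
- (constant → constant+) A simple constant is replaced with a more complex one (for example, a string of one character with a longer string).
- (constant → scalar) A constant is replaced with a variable or an argument.
- (statement → statements) More unconditional statements are added (break, continue, return and the like).
- (unconditional → if) The execution path is split in two.
- (scalar → array) A variable or argument is replaced with an array.
- (array → container) An array is replaced with a more complex container.
- (statement → recursion) A statement is replaced with a recursive call.
- (if → while) A condition is replaced with a loop.
- (expression → function) An expression is replaced with a function or an algorithm.
- (variable → assignment) The value of a variable is changed.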
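As a rough illustration of how these transformations build on one another, the sketch below shows three successive versions of a hypothetical digit-counting function. CountDigits and the Step1 to Step3 namespaces are assumptions made only for this example and are not part of the interpreter. Each version is the previous one after a single transformation from the list.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Hypothetical example, not part of the interpreter.
// Each namespace holds the function after one more transformation.

namespace Step1 {
// (nil -> constant): enough for the test "an empty string has no digits".
inline int CountDigits(const std::wstring &) { return 0; }
}

namespace Step2 {
// (unconditional -> if): enough for a test with a single character.
inline int CountDigits(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}
}

namespace Step3 {
// (if -> while), with a counter variable: handles input of any length.
inline int CountDigits(const std::wstring &text) {
    int count = 0;
    const wchar_t *current = text.c_str();
    while(*current) {
        if(std::iswdigit(*current)) ++count;
        ++current;
    }
    return count;
}
}
```

Each step is driven by one new failing test, and no version does more than its tests require.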
To make the test pass it is enough to return an empty list.

    inline Tokens Tokenize(std::string expr) { return{}; }
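Note that from here on std::string is replaced with std::wstring, so that the lexer takes its input as a Unicode (wide-character) string. The next test expects a single plus sign to produce a single operator token.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }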
AssertRange is a small helper namespace. Its AreEqual function compares an expected list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // Compare the sizes first, so that a missing or extra token is reported clearly.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
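The same comparison can be sketched without the Visual Studio test framework, which may be handy when trying the idea with another compiler. FirstMismatch is a hypothetical helper introduced only for this illustration. Instead of asserting, it returns the position of the first difference, or -1 when the ranges are equal.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper, not part of the article's code.
// Returns -1 when both ranges are equal, otherwise the position of the first
// mismatch (the length of the shorter range if one of them ends early).
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;

    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        // Convert the actual element to T, as Assert::AreEqual<T> would.
        if(!(*expectIter == static_cast<T>(*actualIter))) return position;
    }

    // One of the ranges ended before the other.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) return position;
    return -1;
}
```

Taking the expected values as an initializer_list keeps the call site short: the expected tokens can be written inline in braces, just as in the tests.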
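Operator is an enumeration with wchar_t as its underlying type, so the value of each enumerator is the operator's character. For now, Token is simply an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;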
For Assert to be able to report a failed comparison, a ToString function has to be defined for the compared type.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }
Now the test compiles and fails. To make it pass we apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
The next test: a single digit is tokenized as a number.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
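This test does not even compile: the lexer now has to return a number, so Token can no longer be a plain alias for Operator. A token must be able to hold either an operator or a number. Several designs are possible:
- a Token class hierarchy, with dynamic_cast to get at the concrete token;
- a design built around std::function;
- a type-erasing container such as Boost.Any.

We will take a simpler route and keep the kind of the token together with its value in a single class. Since Token becomes a class of its own, it gets its own tests.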
We start with the type of the token.

    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
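As with Token, a ToString function is needed for TokenType so that Assert can print it. The next test asks for the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }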
The test fails. Applying (constant → scalar) makes it pass.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }

    private:
        TokenType m_type;
    };
Next, the operator has to be retrievable from an operator token.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

A stored field and a conversion operator are enough.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same is needed for the value of a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }

A token holds either an operator or a number but never both, so the two fields can share storage in a union. A conversion that does not match the type of the token throws an exception.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }
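For comparison, here is a minimal sketch of the same idea expressed with std::variant. This is not the approach taken in the article: it needs C++17, which the compiler used here does not provide, and the names VariantToken, IsNumber and NumberOf are assumptions made for the illustration. The variant stores the active alternative together with the value, much like the hand-written m_type plus union pair.

```cpp
#include <cassert>
#include <variant>

// Sketch only, requires C++17. All names below are hypothetical.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// The variant remembers which alternative it currently holds.
using VariantToken = std::variant<Operator, double>;

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}

// std::get throws std::bad_variant_access if the token holds an Operator,
// playing the role of the std::logic_error thrown by the conversion operators.
inline double NumberOf(const VariantToken &token) {
    return std::get<double>(token);
}
```

Equality comparison comes for free with the variant, while the hand-written class has to define operator== itself. The explicit class, on the other hand, works with older compilers and lets the tests read a token through an ordinary conversion.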
Now that Token can hold a number, we can return to the lexer and make the single-digit test pass:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };
The next test: a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
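The C standard library already has what we need: isdigit recognizes a digit and atof converts a string to a number. For wchar_t strings the counterparts are iswdigit and the Microsoft-specific _wtof. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }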
The next test: an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
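The function now has to return more than one token. Before changing its behavior we prepare the code: the tokens are collected in a result variable, and the function returns that variable.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        } else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }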
Now the transformation itself: (if → while). The condition becomes a loop that runs until the end of the string.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // continue right after the number
            } else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
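wcstod is used here instead of _wtof because, in addition to the value, it returns a pointer to the first character after the number, so scanning can continue from that position.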
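The behavior of the end pointer can be sketched in isolation. ScannedNumber and ScanLeadingNumber are hypothetical names used only in this example. The function parses the number at the start of a string and reports how many characters it occupied.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper for illustration only.
struct ScannedNumber {
    double value;        // the parsed value, 0 if nothing was parsed
    std::size_t length;  // number of characters consumed, 0 if nothing was parsed
};

inline ScannedNumber ScanLeadingNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    // After the call, end points to the first character that is not part of the number.
    return { value, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.5+3" the function returns the value 12.5 and a length of 4, leaving the plus sign for the next step of the scan. When no number can be parsed, as for L"+", wcstod sets the end pointer to the start of the input, so the reported length is 0.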
The next test: spaces are skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
We apply (unconditional → if) once more: only a plus sign is treated as an operator, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        } else {
            ++current; // not a digit and not an operator: skip it
        }
    }
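One consequence of the final else branch is worth seeing in action: not only spaces but any unrecognized character is silently dropped. The sketch below reproduces the loop in a simplified, portable form. SplitLexemes is a hypothetical function that keeps lexemes as strings instead of Token objects, an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Simplified sketch, not the article's lexer: lexemes are returned as strings.
inline std::vector<std::wstring> SplitLexemes(const std::wstring &expr) {
    std::vector<std::wstring> result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            std::wcstod(current, &end); // used only to find where the number ends
            result.push_back(std::wstring(current, static_cast<std::size_t>(end - current)));
            current = end;
        } else if(*current == L'+') {
            result.push_back(std::wstring(1, *current));
            ++current;
        } else {
            ++current; // spaces and any unknown characters are skipped
        }
    }
    return result;
}
```

With this loop the input L"1 a 2" yields two lexemes, and the letter disappears without any error. That is acceptable at this stage, but a later test for invalid input would probably have to change this branch.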
The function has grown, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize only delegates to it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
One more test: a complex expression that uses every operator and both parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
The remaining operators are added to the Operator enumeration.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
The IsOperator method of Tokenizer is updated to check all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
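Because each operator is a single character, the same check could also be written as a lookup in a string of operator characters. The sketch below is an alternative, not the article's code. IsOperatorChar and the literal L"+-*/()" are assumptions, and the literal would have to be kept in sync with the Operator enumeration by hand.

```cpp
#include <cassert>
#include <cwchar>

// Alternative sketch; the function name and the operator string are hypothetical.
inline bool IsOperatorChar(wchar_t symbol) {
    // wcschr also finds the terminating L'\0' of the searched string,
    // so the end-of-string character has to be excluded explicitly.
    return symbol != L'\0' && std::wcschr(L"+-*/()", symbol) != nullptr;
}
```

The version with std::any_of has the advantage that it names the enumerators, so a misspelled or removed operator is caught by the compiler. The string lookup is shorter but loses that check.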
With these changes all lexer tests pass.
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
Now the test compiles and fails. To pass it, apply the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
The next test tokenizes a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
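This test does not compile: a token must now be able to hold either an operator or a number. There are several ways to model that:
- Make Token a base class with a subclass per kind of token and use dynamic_cast to get at the concrete type.
- Build the token around std::function.
- Use a type-erased holder such as Boost.Any, but this project is limited to the standard library.

I'll take the simplest route: a single Token class that stores its type together with its value. Since Token becomes a class with behavior of its own, it gets its own tests.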
Start with the token type. The first test checks that a token built from an operator reports the Operator type.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
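As before, a ToString overload is needed, this time for TokenType; it appears in the full listing at the end. The next test checks the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }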
The test fails. Apply (constant → scalar): replace the constant with a member variable set in the constructor.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
Next, a test for reading the operator back from an operator token.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

Store the operator and add a conversion operator.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
And the same for the value of a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }

A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the token type and throw if it does not match.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        // Only the member that matches m_type is ever read.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: the function returns std::wstring.
                return L"Unknown token.";
        }
    }
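The following self-contained sketch shows how the type check in the conversion operators behaves from the caller's side. The Token class is copied from the article and trimmed to a single operator; RejectsOperatorRead is a hypothetical helper added only for this illustration.

```cpp
#include <stdexcept>

// Copied from the article's Token, trimmed to what the example needs.
enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper for this example: returns true if reading `token`
// as an Operator is rejected with std::logic_error.
inline bool RejectsOperatorRead(const Token &token) {
    try {
        Operator op = token;   // invokes Token::operator Operator()
        (void)op;
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

Because Operator is a scoped enumeration, it does not convert implicitly to or from double, so each cast selects exactly one of the two conversion operators.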
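With Token in place, we can return to the lexer and the single-digit test, which now compiles but fails. To make it pass:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };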
The test passes, and there is nothing to refactor. Next, a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
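The C library already has what we need: isdigit to recognize a digit and atof to convert a string to a number, both with counterparts for wchar_t (iswdigit and _wtof; the latter is specific to the Microsoft C runtime). Using them is an (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }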
Next, an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
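The function returns after the first token, so the test fails. Before changing the behavior, prepare the ground with a small refactoring: collect the tokens in a result variable and return it at the end.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }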
Now the transformation itself: (if → while). The loop runs until the end of the string.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
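wcstod replaces _wtof here because, besides the value, it reports where the number ends, which lets the loop continue from the next character.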
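The end-pointer behavior of wcstod can be seen in isolation in the sketch below. ScanNumbers is a hypothetical helper written for this illustration, not part of the article's lexer. Unlike the lexer, it calls wcstod at every position, so it also accepts what wcstod itself accepts (leading whitespace and a sign, for instance); the article's loop avoids that by checking iswdigit first.

```cpp
#include <cwchar>
#include <vector>

// Hypothetical helper: collects every number that wcstod can parse in `text`,
// skipping any other character, to show how the end pointer drives the scan.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    const wchar_t *current = text;
    while(*current) {
        wchar_t *end = nullptr;
        double value = std::wcstod(current, &end);
        if(end == current) {
            ++current;          // nothing was parsed here: skip one character
        } else {
            numbers.push_back(value);
            current = end;      // resume right after the parsed number
        }
    }
    return numbers;
}
```

When no conversion is possible, wcstod sets the end pointer to the start of the input, which is how the sketch tells "no number here" apart from a parsed zero.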
Next, the lexer should skip spaces.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
An (unconditional → if) transformation does it: only a plus is treated as an operator, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }
All tests pass, so it is time to refactor. The function has grown, so I'll move the logic into a Tokenizer class in a nested Detail namespace. Tokenize itself becomes a thin wrapper.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
One test remains: an expression with all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
First, extend the Operator enumeration with the missing values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
Then extend the IsOperator method of the Tokenizer class so that it recognizes all of them.

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
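As a point of comparison only, the same membership check could be sketched as a lookup in a string of operator characters. IsOperatorChar is a hypothetical free function, not the article's method, and it has a drawback the article's version avoids: the character list has to be kept in sync with the Operator enumeration by hand.

```cpp
#include <cwchar>

// Hypothetical alternative to the any_of version: looks the character up in
// a string of operator characters. The terminating L'\0' is excluded
// explicitly, because wcschr also matches the string terminator.
inline bool IsOperatorChar(wchar_t c) {
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The explicit terminator check matters in a lexer loop, where the current character can be the end of the input.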
All tests pass, and the lexer is done. Here is the complete code.

Interpreter.h

    #pragma once
    #include <vector>
    #include <string>
    #include <stdexcept>
    #include <wchar.h>
    #include <wctype.h>
    #include <algorithm>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        // Only the member that matches m_type is ever read.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = {
                Operator::Plus, Operator::Minus, Operator::Mul,
                Operator::Div, Operator::LParen, Operator::RParen
            };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual);
            ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position "
                + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
                Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
                Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
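The test does not compile yet, because none of the names it uses exist. The token list is a plain std::vector, aliased as Tokens, and Tokenize starts as a stub that throws, so the first test fails (red).
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: throws so that the first test fails.
inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter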
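To decide which change to make next, I follow Robert Martin's Transformation Priority Premise (TPP). Refactorings change the structure of code without changing its behavior; transformations are their counterpart: they change behavior, each making the code slightly more general. The premise is that when a failing test can be made to pass in more than one way, the simpler transformation, the one higher in the list, should be preferred. Tests are then chosen so that the simpler transformations are enough for as long as possible.
The transformations, from simplest to most complex:
- ({} → nil) From no code at all to code that returns an empty value.
- (nil → constant) From an empty value to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) Replacing a constant with a variable or an argument.
- (statement → statements) Adding more unconditional statements (break, continue, return and so on).
- (unconditional → if) Splitting the execution path.
- (scalar → array) Replacing a variable or argument with an array.
- (array → container) Replacing an array with a more complex container.
- (statement → recursion) Replacing a statement with recursion.
- (if → while) Replacing a condition with a loop.
- (expression → function) Replacing an expression with a function or algorithm.
- (variable → assignment) Replacing the value of a variable.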
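To make the list less abstract, here is a minimal sketch of three transformations applied to one function. CountPluses and the Step1 to Step3 namespaces are hypothetical names invented for this illustration; they are not part of the interpreter.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example: count the '+' characters in a string.

namespace Step1 {
// (nil → constant): a constant is enough for the "empty string" test.
inline int CountPluses(const std::wstring &) { return 0; }
}

namespace Step2 {
// (unconditional → if): the execution path is split on the empty case.
// The value 1 is still a constant; it only satisfies the "single plus" test.
inline int CountPluses(const std::wstring &s) {
    if(s.empty()) return 0;
    return 1;
}
}

namespace Step3 {
// (if → while): the condition becomes a loop over the whole string,
// and the constants are replaced by a counter variable.
inline int CountPluses(const std::wstring &s) {
    int count = 0;
    std::size_t i = 0;
    while(i < s.size()) {
        if(s[i] == L'+') ++count;
        ++i;
    }
    return count;
}
}
```

Each step is the smallest change that satisfies one more test, which is the same rhythm the lexer follows in the rest of this section.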
The first transformation, ({} → nil), is enough to make the test pass.
inline Tokens Tokenize(std::string expr) { return {}; }
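The test passes, and there is nothing to refactor yet. One change is worth making right away, though: std::string is replaced with std::wstring, so that the lexer works with Unicode strings. Next test.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}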
AssertRange is a small helper namespace of my own. Its AreEqual compares two ranges element by element: it first checks that the sizes match, then reports the position of the first mismatch.
AssertRange
namespace AssertRange {

// Relies on `using namespace std;` and the CppUnitTestFramework namespace
// being visible in the test file.
template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
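The same element-by-element comparison can be sketched without any test framework. FirstMismatch below is a hypothetical helper written for this illustration; it returns the position that AssertRange would report, under the assumption that a size difference counts as a mismatch at the end of the shorter range.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Returns the index of the first differing element, or -1 if the ranges are equal.
// If one range is a prefix of the other, the index is the length of the shorter one.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter) {
        if(!(*expectIter == *actualIter)) break;
    }
    if(expectIter == std::end(expect) && actualIter == std::end(actual)) return -1;
    return std::distance(std::begin(expect), expectIter);
}
```

Taking the expected values as an initializer_list is what lets a test write the expectation inline, as in `FirstMismatch({ 1, 2, 3 }, values)`.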
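Operator is an enum class whose underlying type is wchar_t, so the value of each operator is simply its character. For now, Token is just an alias for Operator.
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;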
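For Assert to print the compared values when a test fails, a ToString function has to be available for the type, so one is added.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return { static_cast<wchar_t>(token) };
}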
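Now everything compiles, but the new test fails, because Tokenize still returns an empty list. To fix it, the execution path is split: (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return {};
    }
    return { static_cast<Operator>(expr[0]) };
}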
Both tests pass. The next test is for a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
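This test raises a design question: a token now has to hold either an operator or a number. Several options come to mind:
- A Token class hierarchy, with the concrete type recovered through dynamic_cast.
- A design built around std::function.
- A type-erased container such as Boost.Any.
Whichever storage is chosen, Token becomes a class of its own and therefore gets its own tests.
The first Token test checks the type of an operator token.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};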
As before, a ToString function is needed for TokenType so that the test compiles; it is trivial and omitted here. Next test.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The test fails. The constant returned by Type() is replaced with a member variable: (constant → scalar).
class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, the operator stored in a token has to be retrievable.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
A conversion operator that returns the stored value makes it pass.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same test for numbers.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the token type before returning a value.
Token
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
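With Token in place, it is time to return to the lexer and the single-digit test.
The simplest change that makes it pass:
if(expr[0] >= L'0' && expr[0] <= L'9') {
    return { static_cast<double>(expr[0] - L'0') };
}
return { static_cast<Operator>(expr[0]) };
The tests pass, so on to the next one.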
A number can have more than one digit and a fractional part.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
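The C library already has what is needed here: isdigit recognizes a digit, atof converts a string to a number, and both have counterparts for wchar_t. The hand-written expressions are replaced with function calls: (expression → function).
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return {};
    // _wtof is the Microsoft-specific wide-character version of atof.
    if(iswdigit(*current)) return { _wtof(current) };
    return { static_cast<Operator>(*current) };
}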
The test passes. The next test combines an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
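The function now has to return more than one token, and the current structure with early returns does not allow that. As a preparatory refactoring, the tokens are collected in a local variable, result, which is returned at the end.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}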
Now the transformation itself: (if → while). The condition becomes a loop that runs until the end of the string.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
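Note the switch to wcstod. Unlike _wtof, it also reports where the number ends, through its second argument. Without that pointer, the loop would not know how many characters to skip after reading a number.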
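The role of the end pointer is easy to see in isolation. The sketch below wraps std::wcstod in a hypothetical helper, ParseNumber, which returns both the value and the number of characters consumed; the ParsedNumber struct is likewise an assumption made for this example.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical result type: the parsed value and how many characters it occupied.
struct ParsedNumber {
    double value;
    std::size_t length; // 0 means that no number was found at the start of the text
};

inline ParsedNumber ParseNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    // wcstod stores in `end` a pointer to the first character after the number.
    // If no conversion is possible, `end` equals `text` and the result is 0.
    double value = std::wcstod(text, &end);
    return ParsedNumber{ value, static_cast<std::size_t>(end - text) };
}
```

One thing to keep in mind: wcstod itself skips leading whitespace and accepts a leading sign, so a lexer that treats + and - as separate tokens should call it only when the current character is a digit, as the loop above does.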
The tests pass. The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
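The (unconditional → if) transformation again: a character becomes an operator token only if it is an operator, and anything else is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current; // skip spaces and unknown characters
    }
}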
The tests pass, and it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The code reads much better now, and adding the remaining operators is straightforward. The last lexer test covers an expression with all operators and parentheses.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
Operator gets the remaining operators and both parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
IsOperator in Tokenizer now checks the current character against all of them.
bool IsOperator() const {
    // Requires <algorithm> for std::any_of.
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
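As an aside, the same membership check can be written with std::wcschr over a string of operator characters. IsOperatorChar is a hypothetical free function used only for this sketch, and the character list is an assumption that mirrors the Operator values.

```cpp
#include <cwchar>

// Hypothetical alternative to Tokenizer::IsOperator, shown as a free function.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also matches the terminating L'\0' of the search string,
    // so the end-of-string character has to be excluded explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The trade-off is that the operator characters are listed a second time, outside the Operator enum, so the two lists have to be kept in sync by hand. The any_of version avoids a second list of characters, although its list of enumerators still has to be extended by hand.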
All tests pass, and the lexer is complete. The full code of the lexer and its tests:
Interpreter.h
#pragma once
#include <algorithm>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext(); // skip spaces and unknown characters
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end; // continue right after the number
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

// Compares two ranges element by element and reports the first mismatch.
template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
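For readers who want to try the lexer without Visual Studio, here is a condensed, framework-free sketch of the same scanning loop using only the standard library. It is a simplification, not the code above: SimpleToken and SimpleTokenize are hypothetical names, and a plain struct stands in for the union-based Token class.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Simplified stand-in for the Token class: a plain struct instead of a tagged union.
struct SimpleToken {
    bool isNumber;
    double number; // meaningful only when isNumber is true
    wchar_t op;    // meaningful only when isNumber is false
};

inline std::vector<SimpleToken> SimpleTokenize(const std::wstring &expr) {
    std::vector<SimpleToken> result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            // A number: wcstod reports where it ends through `end`.
            wchar_t *end = nullptr;
            double value = std::wcstod(current, &end);
            result.push_back(SimpleToken{ true, value, L'\0' });
            current = end;
        } else if(std::wcschr(L"+-*/()", *current)) {
            // An operator or a parenthesis: one character, one token.
            result.push_back(SimpleToken{ false, 0.0, *current });
            ++current;
        } else {
            ++current; // spaces and any other characters are skipped
        }
    }
    return result;
}
```

Calling `SimpleTokenize(L"1+2*3/(4-5)")` yields eleven tokens, matching the expectation in Should_tokenize_complex_experssion.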
GitHub . , . "__".
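The test does not compile yet, because neither the lexer nor the token type exists. The list of tokens is a plain std::vector, hidden behind the Tokens typedef. The stub below is enough to make the test compile and fail.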
    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Deliberately failing stub: the first run of the test must be red.
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter
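To decide how to make a failing test pass, I rely on The Transformation Priority Premise (TPP), formulated by Robert Martin. Refactorings change the structure of code without changing its behavior; transformations are their counterpart and change behavior, moving the code from the specific to the more general. The transformations are ordered by priority, from the simplest to the most complex, and at each step the simplest transformation that makes the test pass is preferred. The same ordering helps when choosing the next test: pick the one that can be passed with the simplest transformation. The names from the TPP list are used in the rest of this section.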
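The transformations, in order of priority:
- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): from nil to a constant.
- (constant → constant+): from a simple constant to a more complex one.
- (constant → scalar): replace a constant with a variable or an argument.
- (statement → statements): add more unconditional statements (break, continue, return and the like).
- (unconditional → if): split the execution path.
- (scalar → array): replace a scalar with an array.
- (array → container): replace an array with a more general container.
- (statement → recursion): introduce recursion.
- (if → while): turn a conditional into a loop.
- (expression → function): replace an expression with a function or an algorithm.
- (variable → assignment): change the value of a variable.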
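As an illustration of how these transformations stack up, here is a minimal sketch that is not part of the interpreter. The function CountDigits and the namespaces Step1 to Step3 are names assumed for this example only; each namespace holds the state of the function after one more test has forced one more transformation.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical example: CountDigits is not part of the interpreter.

namespace Step1 {
// (nil -> constant): the test for an empty string passes with a constant.
inline int CountDigits(const std::wstring &) { return 0; }
} // namespace Step1

namespace Step2 {
// (unconditional -> if): a test with a single character forces a branch.
inline int CountDigits(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}
} // namespace Step2

namespace Step3 {
// (constant -> scalar) and (if -> while): a test with a longer string
// forces a counter variable and a loop over all characters.
inline int CountDigits(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(std::iswdigit(text[i])) ++count;
        ++i;
    }
    return count;
}
} // namespace Step3
```

Each step keeps all earlier tests green while making the code slightly more general, which is the progression the lexer goes through below.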
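The simplest change that makes the first test pass is to return an empty list.

    inline Tokens Tokenize(std::string expr) { return{}; }

The test is green, and there is nothing to refactor yet.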
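Note that the tests pass wide string literals, so std::string in the signature has to be replaced with std::wstring. The test framework works with Unicode strings throughout. The next test covers a single operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }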
AssertRange is a small helper namespace of my own. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // The sizes must match before the elements are compared.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
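AssertRange is tied to CppUnitTestFramework. For readers using another framework, the same comparison can be sketched without any test library. FirstMismatch is a hypothetical helper introduced here, not something from the article; it assumes only that the element type supports operator==.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper, not from the article: returns the index of the first
// position where the two ranges differ, or -1 if they are equal.
// A length difference is reported at the index where the shorter range ends.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;

    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }

    // One of the ranges ran out first, so the sizes differ.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) return position;
    return -1;
}
```

The returned index can be fed to whatever assertion macro the chosen framework provides, which keeps the failure message as precise as the one AssertRange produces.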
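Operator is an enumeration whose underlying type is wchar_t, so the value of each enumerator is the character of the operator it denotes. For now a token is simply an operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;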
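For Assert to print values of a custom type in its failure messages, the framework needs a ToString function for that type.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }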
Now the test compiles and fails. To make it pass, apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
The next test covers a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
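The test does not compile: a token must now be able to hold a number as well as an operator. There are several ways to achieve that:
- make Token the base class of a hierarchy and recover the concrete type with dynamic_cast;
- build the solution around std::function;
- use a type-erased holder such as Boost.Any.
Here a simpler route is taken: the token stores a tag describing what it holds. Since Token becomes a class with behavior of its own, it gets its own tests. The lexer test is put aside until Token is ready.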
    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };

    …

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
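A ToString function is needed for TokenType as well, so that Assert can print it. The next test asks for the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }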
The test fails. Apply (constant → scalar): the hard-coded type becomes a member variable.

    class Token {
    public:
        Token(Operator) : m_type(TokenType::Operator) {}
        Token(double) : m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
The next test reads the operator back from an operator token.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
Store the operator in the token and add a conversion operator.

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same test for a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two fields can share storage in a union. Each conversion operator checks the type tag and throws if the token holds the other kind of value.

Token

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: the function returns std::wstring.
                return L"Unknown token.";
        }
    }
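With Token able to hold a number, return to the lexer test for a single digit. It passes once Tokenize recognizes a digit character:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };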
The next test is a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
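The C standard library already has isdigit to test for a digit and atof to convert a string to a number, and both have counterparts for wchar_t. Using them is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }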
The tests pass. Next, an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
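To return more than one token, the tokens have to be accumulated somewhere. As a preparatory step, introduce a local variable result; the behavior does not change yet.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }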
The next transformation is (if → while): the check for the end of the string becomes the loop condition.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
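wcstod replaces _wtof here because, besides converting the number, it reports through its second argument where the number ends. That lets the loop continue from the first character after the number.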
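The role of the end pointer is easier to see in isolation. The sketch below is illustrative only: ScanNumbers is a hypothetical function, not part of the lexer, and it assumes the default "C" locale, in which the decimal separator is a dot.

```cpp
#include <cwchar>
#include <vector>

// Illustrative only: collects every number found in a wide string,
// skipping every character that is not a digit.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    while(*text) {
        if(*text >= L'0' && *text <= L'9') {
            wchar_t *end = nullptr;
            // wcstod stores in 'end' the address of the first character
            // that is not part of the parsed number.
            numbers.push_back(std::wcstod(text, &end));
            text = end; // continue right after the number
        }
        else {
            ++text;
        }
    }
    return numbers;
}
```

For the input L"12.34+5" the first call to wcstod leaves end pointing at the plus sign, so the loop skips it and then parses the 5 as a second number.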
The next test checks that spaces are skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
One more (unconditional → if): only a known operator character becomes a token, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }
The function has grown, so it is time to refactor. The scanning logic moves into a class in the Detail namespace, and the Tokenize function only delegates to it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
So far the lexer knows only the plus operator. The next test covers an expression with all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen) }, tokens);
    }
First extend the Operator enumeration so that the test compiles.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
Then generalize the IsOperator method of the Tokenizer class so that it recognizes every operator. It relies on std::any_of from <algorithm>.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
All the tests pass. The complete code of the lexer follows.

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <initializer_list>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
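wcstod replaces _wtof because, besides converting the number, it reports through its second argument where the number ends, so scanning can continue from that position.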
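The role of the end pointer is easiest to see in isolation. Below is a minimal sketch, assuming a hypothetical helper ScanLeadingNumber and a hypothetical struct ScannedNumber, neither of which is part of the article's lexer. It converts the number at the start of a string and reports how many characters were consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Result of scanning a number at the start of a string.
struct ScannedNumber {
    double value;          // converted value, 0.0 if nothing was consumed
    std::size_t consumed;  // how many characters belonged to the number
};

// Hypothetical helper for illustration only.
inline ScannedNumber ScanLeadingNumber(const std::wstring &text) {
    const wchar_t *begin = text.c_str();
    wchar_t *end = nullptr;
    // wcstod stores in 'end' the address of the first character it did not use.
    const double value = std::wcstod(begin, &end);
    return { value, static_cast<std::size_t>(end - begin) };
}
```

For the input L"12.5+3" the helper consumes the four characters of 12.5 and stops at the plus sign. For L"+" alone nothing is consumed, which is why the lexer has to recognize a digit first and only then call wcstod.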
The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
Applying (unconditional → if): a character is added as an operator only if it really is one, and any other character is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}
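The function has grown enough to deserve a refactoring. The scanning logic moves into a separate class in the Detail namespace, and the Tokenize function only delegates to it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}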
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
So far the lexer recognizes only the plus operator. The next test uses a complete expression with all the operators and parentheses.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enumeration is extended with the remaining operators and the parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of the Tokenizer class is changed to check against all of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
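The same check can also be written as a lookup in a string of operator characters. This is an alternative sketch, not the article's code; the free function IsOperatorChar and its character list are assumptions made for the illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical free-function variant of IsOperator: a character is an
// operator if it occurs in the list of known operator characters.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}
```

The trade-off is that the character list duplicates the values of the Operator enumeration, so the two have to be kept in sync by hand. The std::any_of version over the enumerators avoids repeating the characters, although its list of enumerators still has to be extended when a new operator is added.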
All tests pass. The complete code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }
    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }
    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests
GitHub . , . "__".
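For the first test to compile, a minimum of production code is needed: the Tokens type, which is a std::vector of tokens, and the Tokenize function. For now the function only throws an exception, so the test fails.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter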
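The test fails, as it should. Now it has to pass, and with the simplest possible code. To decide what counts as simplest, we can rely on The Transformation Priority Premise (TPP). Refactorings change the structure of code without changing its behavior; transformations are their counterpart: they change behavior, each time making the code a little more general. The premise is that transformations have a natural order from simple to complex, and that at every step the simplest transformation that makes the current test pass should be preferred. Following this order keeps the steps small, so TPP is a useful guide when choosing both the next test and the next change.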
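The transformations, in order of priority:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) Replace nil with a constant.
- (constant → constant+) Replace a simple constant with a more complex one.
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements.
- (unconditional → if) Split the execution path.
- (scalar → array) Replace a single value with an array.
- (array → container) Replace an array with a more general container.
- (statement → recursion) Replace a statement with a recursive call.
- (if → while) Replace a condition with a loop.
- (expression → function) Replace an expression with a function or an algorithm.
- (variable → assignment) Replace the value of a variable.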
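As an illustration of how the transformations build on each other, here is a hedged sketch of three successive versions of one small function. The names CountPlusesV1 to CountPlusesV3 are hypothetical and are not part of the interpreter; each version is driven mainly by one transformation from the list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Version 1, (nil → constant): enough for a test with an empty string.
inline std::size_t CountPlusesV1(const std::wstring &) {
    return 0;
}

// Version 2, (unconditional → if): the execution path is split, which is
// enough for strings of at most one character.
inline std::size_t CountPlusesV2(const std::wstring &text) {
    if (!text.empty() && text[0] == L'+')
        return 1;
    return 0;
}

// Version 3, (if → while): the single check becomes a loop over the whole
// string, which also requires a counter variable.
inline std::size_t CountPlusesV3(const std::wstring &text) {
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] == L'+')
            ++count;
        ++i;
    }
    return count;
}
```

Each version is written only when a new test makes the previous one insufficient, which is the same rhythm the lexer follows in this article.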
The first test needs only the simplest change: return an empty list.
inline Tokens Tokenize(std::string expr) {
    return{};
}
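The test passes, and there is little to refactor yet apart from one change: std::string is replaced with std::wstring, so that the lexer can handle Unicode input. The next test is a single plus operator.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}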
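AssertRange is a small helper namespace of our own. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.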
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
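The comparison logic itself does not depend on the test framework. Here is a minimal sketch of the same idea in plain C++, assuming a hypothetical function FirstMismatch that returns the position of the first difference instead of raising an assertion:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first position where the
// ranges differ (including one range ending early), or -1 if they are equal.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter))
            return position;
    }
    // One range is longer than the other: they differ at the shorter length.
    if (expectIter != std::end(expect) || actualIter != std::end(actual))
        return position;
    return -1;
}
```

With a std::vector<int> holding 1, 2, 3, the call FirstMismatch({1, 5, 3}, v) reports position 1. Unlike AssertRange::AreEqual, this sketch folds the size check into the same result, so a shorter range is reported as a mismatch at its end.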
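Operator is an enumeration with wchar_t as its underlying type, so an operator character converts to it with a plain cast. For now, Token is just another name for Operator.
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;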
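For the test to compile, Assert also needs a ToString function for the compared type, which it uses to print values when a comparison fails.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}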
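The benefit of choosing wchar_t as the underlying type is that conversion in both directions is a plain cast. A minimal sketch follows; the helper ToOperator is a hypothetical name introduced only for this example.

```cpp
#include <cassert>
#include <string>

enum class Operator : wchar_t { Plus = L'+' };

// Hypothetical helper: character → Operator. No validation is performed,
// so the caller must pass a character that names a known operator.
inline Operator ToOperator(wchar_t c) {
    return static_cast<Operator>(c);
}

// Operator → one-character string, in the spirit of the ToString function above.
inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}
```

Because the cast does not validate its argument, the lexer has to check that a character is an operator before converting it.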
Now the test compiles and fails. To make it pass, we apply (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test is a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
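This test forces a design decision: a token must now be able to hold either an operator or a number. Possible approaches include a hierarchy of Token classes queried with dynamic_cast, an approach built on std::function, and a type-erased holder such as Boost.Any. Here we take the simplest one: a single Token class that stores its own type. We develop it test-first, starting with the type of an operator token.
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};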
Assert needs a ToString function for TokenType as well. The next test is the type of a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The hard-coded constant no longer works. Applying (constant → scalar), the type is stored in a member variable.
class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
The next test: the operator stored in a token can be read back.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
A conversion operator is enough to make it pass.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same test for a number token:
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
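Since a token holds either an operator or a number but never both, the two values can share storage in an anonymous union. The conversion operators check the stored type before reading from it.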
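To make the test compile, we need a Tokenize function that takes a string and returns a std::vector of tokens, aliased as Tokens. For now it only throws, so the test fails.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter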
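How should the failing test be made to pass? The Transformation Priority Premise (TPP) is a useful guide. A refactoring changes the structure of code without changing its behavior. A transformation changes behavior, moving the code from the specific toward the general. According to TPP, transformations are ordered by priority, and simpler ones near the top of the list should be preferred over those further down.
The transformations, in priority order:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) From nil to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements (break, continue, return and so on).
- (unconditional → if) Split the execution path.
- (scalar → array) From a variable or argument to an array.
- (array → container) From an array to a container.
- (statement → recursion) From a statement to recursion.
- (if → while) From a condition to a loop.
- (expression → function) Replace an expression with a function or algorithm.
- (variable → assignment) Replace the value of a variable.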
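To make the ordering more concrete, here is a minimal sketch that grows one small function through three of the transformations above. The function name `CountLeadingDigits` and its three versions are hypothetical and exist only for this illustration; they are not part of the interpreter.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical example: count the leading decimal digits of a string,
// grown one TPP transformation at a time.

// (nil -> constant): the simplest code that satisfies "empty input gives 0".
inline std::size_t CountLeadingDigitsV1(const std::string &) {
    return 0;
}

// (unconditional -> if): split the execution path to handle one digit.
inline std::size_t CountLeadingDigitsV2(const std::string &text) {
    if (!text.empty() && std::isdigit(static_cast<unsigned char>(text[0]))) {
        return 1;
    }
    return 0;
}

// (if -> while): generalize the single check into a loop.
inline std::size_t CountLeadingDigitsV3(const std::string &text) {
    std::size_t count = 0;
    while (count < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[count]))) {
        ++count;
    }
    return count;
}

// Each version still satisfies the checks written for the previous one.
inline void CheckCountLeadingDigits() {
    assert(CountLeadingDigitsV1("") == 0);
    assert(CountLeadingDigitsV2("") == 0);
    assert(CountLeadingDigitsV3("") == 0);
}
```

Each step is driven by one new failing check, and each later version keeps the earlier checks passing.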
The simplest change that makes the test pass is to return an empty list.
inline Tokens Tokenize(std::string expr) {
    return {};
}
From here on, std::string is replaced with std::wstring so that expressions can contain Unicode characters. Now the next test.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
AssertRange is not part of the test framework. It is a small helper namespace whose AreEqual function compares an expected list with an actual range: first the sizes, then the elements one by one.
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    // Compare the lengths first so that a size mismatch is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for (; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
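The same element-by-element comparison can be tried without Visual Studio. The sketch below is a framework-free variant under stated assumptions: the name `FirstMismatch` and its return convention (index of the first difference, or -1 when the ranges match) are hypothetical and not part of the article's code.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first position where the
// ranges differ, or -1 when they are equal in both length and content.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter)) {
            return position;
        }
    }
    // One range ended early: report the end of the shorter one.
    if (expectIter != std::end(expect) || actualIter != std::end(actual)) {
        return position;
    }
    return -1;
}

inline void CheckFirstMismatch() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1);
}
```

Taking the expected values as a `std::initializer_list<T>` is what allows a braced list to be written directly at the call site, just as in the tests.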
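Operator is an enumeration whose underlying type is wchar_t, so each operator's value is simply its character. For now, a token is just an operator.
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;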
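For the code to compile, Assert also needs a ToString function for the token type, which it uses to print values when a check fails.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return { static_cast<wchar_t>(token) };
}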
The test fails because Tokenize still returns an empty list. To make it pass, apply the (unconditional → if) transformation.
inline Tokens Tokenize(std::wstring expr) {
    if (expr.empty()) {
        return {};
    }
    return { static_cast<Operator>(expr[0]) };
}
The next test is for a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
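This test forces a design decision: a token must now be able to hold either an operator or a number. There are several ways to do that:
- Build a hierarchy of classes derived from Token and tell them apart with dynamic_cast.
- Build the token on top of std::function.
- Use Boost.Any or a similar holder for values of any type.
We will take a simpler route and give Token an explicit type. Start with a test for the type of an operator token.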
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
TokenType needs its own ToString overload as well. Next, a test for a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
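To make it pass, apply the (constant → scalar) transformation: the hard-coded type becomes a member variable set by the constructor.
class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};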
Next, a token should give back the operator it was created with.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
Store the operator in a member and return it from a conversion operator.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same is needed for numbers.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number but never both, so the two values can share storage in a union. The conversion operators check the type and throw on a mismatch.
Token
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch (token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // A wide literal is required here: the function returns std::wstring.
            return L"Unknown token.";
    }
}
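To see the guarded access of a tagged union in action outside the test framework, here is a reduced, self-contained sketch. The names `MiniToken`, `Kind`, `AsOperator`, `AsNumber`, and `ThrowsOnWrongAccess` are hypothetical stand-ins chosen for this illustration, and named accessors replace the conversion operators to keep the example short.

```cpp
#include <cassert>
#include <stdexcept>

enum class Kind { Operator, Number };  // hypothetical stand-in for TokenType

class MiniToken {
public:
    explicit MiniToken(wchar_t op) : m_kind(Kind::Operator), m_operator(op) {}
    explicit MiniToken(double num) : m_kind(Kind::Number), m_number(num) {}

    Kind Type() const { return m_kind; }

    wchar_t AsOperator() const {
        if (m_kind != Kind::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    double AsNumber() const {
        if (m_kind != Kind::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    Kind m_kind;  // tag that says which union member is active
    union {
        wchar_t m_operator;
        double m_number;
    };
};

// Reads the member that does NOT match the tag and reports whether that threw.
inline bool ThrowsOnWrongAccess(const MiniToken &token) {
    try {
        if (token.Type() == Kind::Number) {
            token.AsOperator();
        } else {
            token.AsNumber();
        }
    } catch (const std::logic_error &) {
        return true;
    }
    return false;
}

inline void CheckMiniToken() {
    assert(ThrowsOnWrongAccess(MiniToken(1.5)));
}
```

The tag is checked on every read, so reading the inactive union member becomes a reported logic error rather than undefined behavior.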
Back to the lexer. With Token in place, the single-digit test can be made to pass:
if (expr[0] >= L'0' && expr[0] <= L'9') {
    return { static_cast<double>(expr[0] - L'0') };
}
return { static_cast<Operator>(expr[0]) };
Next, a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
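The C standard library already provides what is needed here: isdigit to recognize a digit and atof to convert text to a number, along with their wchar_t counterparts (iswdigit and the Microsoft-specific _wtof). This is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if (!*current) return {};
    if (iswdigit(*current)) return { _wtof(current) };
    return { static_cast<Operator>(*current) };
}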
Now a test with both an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
Before handling several tokens, prepare the code: collect the tokens in a local variable result and return it at the end.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if (!*current) return result;
    if (iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now apply the (if → while) transformation: the single check becomes a loop that runs until the end of the string.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while (*current) {
        if (iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;  // continue right after the parsed number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
Note that wcstod replaces _wtof here: it reports where the number ends, so the current pointer can be moved past it.
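The end-pointer behavior of `wcstod` is easy to check in isolation. The sketch below wraps it in a hypothetical helper, `ScanNumber`, which is a free function written for this illustration rather than the member function of the same name that appears later in the article.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper: parses the number at `current` and advances `current`
// past it. Returns false and leaves both arguments unchanged when no number
// starts at that position.
inline bool ScanNumber(const wchar_t *&current, double &value) {
    wchar_t *end = nullptr;
    const double parsed = std::wcstod(current, &end);
    if (end == current) {
        return false;  // wcstod consumed nothing
    }
    value = parsed;
    current = end;
    return true;
}

inline void CheckScanNumber() {
    const wchar_t *cursor = L"7)";
    double value = 0.0;
    assert(ScanNumber(cursor, value));
    assert(*cursor == L')');
}
```

Keep in mind that `wcstod` accepts more than plain decimals (leading whitespace, a sign, exponents, hexadecimal forms), which is why the lexer only calls it after `iswdigit` has confirmed that the current character is a digit.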
The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
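Apply (unconditional → if) again: a character becomes an operator token only if it is a known operator, and anything else is skipped.
while (*current) {
    if (iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if (*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;  // skip spaces and any other unknown character
    }
}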
The function has grown, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while (!EndOfExperssion()) {
            if (IsNumber()) {
                ScanNumber();
            } else if (IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
Finally, a test with a more complex expression that uses the remaining operators and parentheses.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
Add the missing operators to the Operator enumeration.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
Then update the IsOperator method of Tokenizer to recognize all of them.
bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
        Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
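Because every operator's value is its own character, the same membership check can also be written as a lookup in a string of operator characters. This is only an alternative sketch, not the article's approach, and the free function `IsOperatorChar` is a hypothetical name introduced here.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to the any_of version: looks the character up in
// a fixed set of operator characters.
inline bool IsOperatorChar(wchar_t ch) {
    // wcschr also matches the terminating null of the set, so the end-of-string
    // character has to be excluded explicitly.
    return ch != L'\0' && std::wcschr(L"+-*/()", ch) != nullptr;
}

inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'*'));
    assert(!IsOperatorChar(L' '));
}
```

The trade-off is that the character set now duplicates the values in the enumeration, whereas the `any_of` version keeps `Operator` as the single source of truth.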
That completes the lexer. The full code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch (type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if (left.m_type == right.m_type) {
            switch (left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch (token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while (!EndOfExperssion()) {
            if (IsNumber()) {
                ScanNumber();
            } else if (IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
            Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for (; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
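For readers without Visual Studio, the scanning loop can be exercised with nothing but the standard library. The sketch below is a simplified, hypothetical variant: `SplitTokens` returns the text of each token instead of `Token` objects, so it shows the same number/operator/skip structure without reproducing the article's types.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical, simplified lexer: splits an expression into token strings
// using the same three branches as Tokenizer::Tokenize.
inline std::vector<std::wstring> SplitTokens(const std::wstring &expr) {
    std::vector<std::wstring> result;
    const wchar_t *current = expr.c_str();
    while (*current != L'\0') {
        if (std::iswdigit(static_cast<std::wint_t>(*current))) {
            // Let wcstod find the end of the number, then copy its text.
            wchar_t *end = nullptr;
            std::wcstod(current, &end);
            result.push_back(std::wstring(current, static_cast<const wchar_t *>(end)));
            current = end;
        } else if (std::wcschr(L"+-*/()", *current) != nullptr) {
            result.push_back(std::wstring(1, *current));
            ++current;
        } else {
            ++current;  // skip spaces and unknown characters
        }
    }
    return result;
}

inline void CheckSplitTokens() {
    assert(SplitTokens(L"").empty());
    assert(SplitTokens(L"+").size() == 1);
}
```

It can be compiled with any C++11 compiler and mirrors the cases covered by LexerTests: empty input, single tokens, skipped spaces, and the complex expression.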
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
The code for this article is available on GitHub.
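To let the first test compile, start with the bare minimum: an empty Token struct, Tokens as a typedef for std::vector<Token>, and a Tokenize function that simply throws. The test compiles and fails (red).

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter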
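Now the test has to pass, and the question is how much code to write for that. A useful guide is The Transformation Priority Premise (TPP), proposed by Robert C. Martin. Refactorings change the structure of code without changing its behavior; transformations do the opposite: they change behavior and make the code more general. The premise is that transformations have a priority order, from simple to complex, and that at each step you should prefer the simplest transformation that makes the failing test pass, picking the next test so that a simple transformation is enough. Following TPP is meant to keep steps small and to reduce the risk of getting stuck.

The transformations, from highest priority to lowest:
- ({} → nil): from no code at all to code that employs nil.
- (nil → constant): from nil to a constant.
- (constant → constant+): from a simple constant to a more complex one.
- (constant → scalar): replacing a constant with a variable or an argument.
- (statement → statements): adding more unconditional statements (break, continue, return and the like).
- (unconditional → if): splitting the execution path.
- (scalar → array): from a single value to an array.
- (array → container): from an array to a container.
- (statement → recursion): from a statement to recursion.
- (if → while): from a condition to a loop.
- (expression → function): replacing an expression with a function or algorithm.
- (variable → assignment): replacing the value of a variable.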
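To make the list less abstract, here is a minimal sketch of three of these transformations applied in sequence. It is not part of the article's project: CountPlus and its three versions are hypothetical names chosen only for illustration, and each version is the least general code that would pass the test named in its comment.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical example: count L'+' characters in a string,
// grown one transformation at a time.

// Step 1, (nil -> constant): a test with an empty string only needs a constant.
inline int CountPlusV1(const std::wstring &) { return 0; }

// Step 2, (unconditional -> if): a test with L"+" forces a branch.
inline int CountPlusV2(const std::wstring &s) {
    if (!s.empty() && s[0] == L'+') return 1;
    return 0;
}

// Step 3, (if -> while): a test with several pluses forces a loop.
inline int CountPlusV3(const std::wstring &s) {
    int count = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        if (s[i] == L'+') ++count;
        ++i;
    }
    return count;
}
```

Each step is driven by one new failing test, which is the same rhythm the lexer follows in the rest of the article.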
The first test passes with the simplest change possible: return an empty list.

    inline Tokens Tokenize(std::string expr) { return{}; }
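While writing the next test it becomes clear that std::string should give way to std::wstring, so that expressions are passed as Unicode wide strings. The test:

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }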
AssertRange is a small helper namespace in the test project. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then element by element, reporting the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        // Compare sizes first so that a length mismatch is reported clearly.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
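AssertRange above depends on CppUnitTestFramework. For readers on another toolchain, the same comparison idea can be sketched with the standard library only. FirstMismatch is a hypothetical name, and returning an index instead of raising an assertion is an assumption made to keep the sketch framework-free.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical, framework-free counterpart of AssertRange::AreEqual.
// Returns -1 when both ranges are equal, otherwise the index of the first
// difference (for a pure length mismatch: the length of the shorter range).
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter)) return position;
    }
    const bool bothExhausted =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return bothExhausted ? -1 : position;
}
```

A test would then assert that the result is -1 and print the returned position otherwise.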
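Operator is declared as an enum class with wchar_t as its underlying type, so the value of each operator is the character that denotes it. For now, Token is just another name for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;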
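For the test to compile, Assert also needs a ToString function for the compared type; the framework uses it to print values in failure messages.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }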
The test now compiles and fails. To make it pass, apply the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }

The next test:

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
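This test raises a design question: a token must now be able to hold either an operator or a number. Several options come to mind:
- A Token class hierarchy, with dynamic_cast to get at the concrete type.
- An approach built on std::function.
- A generic holder such as Boost.Any.

The article takes a simpler route: a single Token class that stores its own type. This is a separate piece of behavior, so it gets its own test class, starting with the token type.

    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };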
A ToString function for TokenType is needed as well; it is trivial and appears in the full listing. The next test:

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

To make it pass, apply (constant → scalar): the type becomes a field set by the constructor.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
The next Token test: an operator token should give back its operator.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

Store the operator and add a conversion operator:

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };

The same for numbers:

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token is either an operator or a number, never both, so the two values can share storage in an anonymous union. The conversion operators check the stored type and throw if the token is read as the wrong kind.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}
        TokenType Type() const { return m_type; }
        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }
    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            // Wide literal: the function returns std::wstring.
            default: return L"Unknown token.";
        }
    }
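The guarded conversion operators above are easy to try outside Visual Studio. The following is a minimal portable sketch that repeats the Token class with a one-value Operator enum and checks it with plain assert instead of CppUnitTestFramework. ReadingAsNumberThrows is a hypothetical helper added only for this check.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if (m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if (m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;  // tells which union member is active
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a number is rejected.
inline bool ReadingAsNumberThrows(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch (const std::logic_error &) {
        return true;
    }
}
```

The type tag is what makes the union safe to read: only the member that the constructor wrote is ever returned.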
With Token in place, back to the lexer. The single-digit test passes with one more branch:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };

The next test covers a full floating-point number:

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
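The C standard library already has what is needed here: isdigit to detect a digit and atof to parse the number, or rather their wchar_t counterparts. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }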
The next test has more than one token:

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

Before making it pass, a preparatory refactoring: collect the tokens in a local variable result instead of returning immediately.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }
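Now the transformation itself: (if → while).

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;  // continue right after the parsed number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }

wcstod replaces _wtof here because it also reports where the parsed number ends; the loop needs that position to continue with the next token.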
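To make the role of the end pointer concrete, here is a small standalone sketch. ScanNumbers is a hypothetical helper, not part of the article's lexer; it uses the standard std::iswdigit and std::wcstod and simply skips every character that does not start a number. Note that std::wcstod depends on the current C locale for the decimal separator; the sketch assumes the default "C" locale.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <vector>

// Hypothetical helper: collects every number found in a wide string,
// ignoring all other characters.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    const wchar_t *current = text;
    while (*current) {
        if (std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod stores in `end` the address of the first character
            // it did not consume.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        } else {
            ++current;
        }
    }
    return numbers;
}
```

Without the end pointer, the loop would have to rescan the digits itself to find where the number stops.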
The next test: spaces should be ignored.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

An (unconditional → if) transformation does it: only a known operator becomes a token, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }
The function has grown enough to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}
        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }
        const Tokens &Result() const { return m_result; }
    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }
        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }
        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }
        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }
        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
The last lexer test uses a complete expression with all operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }

Operator is extended with the remaining operators and the parentheses.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
IsOperator in Tokenizer is updated to recognize all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
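The std::any_of version above keeps the check tied to the Operator enum. For comparison, one possible alternative is a lookup in a string of operator characters. This is only a sketch: IsOperatorChar is a hypothetical free function, and the character list is assumed to mirror the enum values shown above.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t c) {
    // std::wcschr also matches the terminating L'\0' of the searched string,
    // so the end-of-string character has to be excluded explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The trade-off is that the operator characters are now listed twice, in the enum and in the string, so the two have to be kept in sync by hand.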
The complete code at this point follows.

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <cwctype>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}
        TokenType Type() const { return m_type; }
        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }
    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}
        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }
        const Tokens &Result() const { return m_result; }
    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }
        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }
        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }
        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }
        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"
    #include <initializer_list>
    #include <iterator>
    #include <string>

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }
        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }
        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }
        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }
        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }
        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }
        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }
        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }
        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    } // namespace InterpreterTests
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code for this article is available on GitHub.
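The first test needs something to call. The token list is a std::vector, aliased as Tokens, and Tokenize starts out as a stub that only throws, so the test compiles and fails.

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};

typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer

} // namespace Interpreter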
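To make a failing test pass, the production code has to change. Robert C. Martin's The Transformation Priority Premise (TPP) calls such behavior-changing steps transformations, as opposed to refactorings, which change structure without changing behavior. The premise is that transformations can be ordered from simple to complex, and that preferring the simpler ones, and choosing tests that can be passed with them, tends to produce simpler and more general code. I will name the transformation applied at each step.
The transformations, from simplest to most complex:
- ({} → nil) No code at all becomes code that returns nil.
- (nil → constant) Nil becomes a constant.
- (constant → constant+) A simple constant becomes a more complex one.
- (constant → scalar) A constant becomes a variable or an argument.
- (statement → statements) More unconditional statements are added (break, continue, return and so on).
- (unconditional → if) The execution path is split.
- (scalar → array) A variable or argument becomes an array.
- (array → container) An array becomes a more general container.
- (statement → recursion) A statement becomes a recursive call.
- (if → while) A condition becomes a loop.
- (expression → function) An expression becomes a function or an algorithm.
- (variable → assignment) The value of a variable is replaced.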
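As a rough illustration of how such transformations drive an implementation, here is a toy function that is not part of the interpreter. The names CountPlusV1 to CountPlusV3 are made up for this sketch; each version is the smallest change that passes one more test.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical toy example: count the '+' characters in a string.

// Test 1: "" gives 0. (nil -> constant): return a constant.
inline std::size_t CountPlusV1(const std::wstring &) {
    return 0;
}

// Test 2: "+" gives 1. (unconditional -> if): split the execution path.
inline std::size_t CountPlusV2(const std::wstring &expr) {
    if (expr.empty()) {
        return 0;
    }
    return 1;
}

// Test 3: "+a+" gives 2. (if -> while): turn the condition into a loop.
inline std::size_t CountPlusV3(const std::wstring &expr) {
    std::size_t count = 0;
    std::size_t pos = 0;
    while (pos < expr.size()) {
        if (expr[pos] == L'+') {
            ++count;
        }
        ++pos;
    }
    return count;
}
```

Each later version still passes the earlier tests, which is the point: the code becomes more general only when a test demands it.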
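The first transformation is enough to make the test pass.

inline Tokens Tokenize(std::string expr) {
    return{};
}

Before the next test, one change: std::string becomes std::wstring, so that expressions can contain Unicode text. The next test is a single plus operator.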
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}

AssertRange is a helper of our own. Its AreEqual compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare the sizes first so that a length mismatch gets its own message.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
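Outside Visual Studio the same element-by-element comparison can be sketched without CppUnitTestFramework. The helper below is a stand-in written for this illustration, not the article's code: FirstMismatch is a made-up name, and it returns a position instead of failing an assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns -1 when both ranges are equal. Otherwise it
// returns the index of the first differing element, or the length of the
// shorter range when one range is a prefix of the other.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter)) {
            return position;
        }
    }
    const bool bothExhausted =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return bothExhausted ? -1 : position;
}
```

Only operator== is required of the element type, which is the same requirement the article's helper places on Token.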
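Operator is an enumeration with wchar_t as its underlying type, so the value of each enumerator is the operator character itself. For now, a token is simply an operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;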
To report a failure, Assert has to print the compared values, and for that it needs a ToString function for the type.

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
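Now the test compiles and fails, because Tokenize still returns an empty list. Passing it takes the (unconditional → if) transformation.

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}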
The next test is a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
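As long as Token is just an alias for Operator, this test cannot pass: a token must now be able to hold either an operator or a number. There are several ways to get there, among them a Token class hierarchy whose concrete type is recovered with dynamic_cast, a design built on std::function, and Boost.Any. I will go with a single Token class that knows its own type, and build it test by test, starting with the type of an operator token.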
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}

    TokenType Type() const {
        return TokenType::Operator;
    }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
ToString gets an overload for TokenType in the same way as before. Next comes the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
This calls for (constant → scalar): the hard-coded return value becomes a member that the constructor sets.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}

    TokenType Type() const {
        return m_type;
    }

private:
    TokenType m_type;
};
The next test reads the operator back from an operator token.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}

The token stores the operator and gets a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

    operator Operator() const {
        return m_operator;
    }
    …
    Operator m_operator;
};

The same test for a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in a union. Each conversion operator checks the token type first and throws if it does not match.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const {
        return m_type;
    }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
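To see the guarded conversions at work without the test framework, the sketch below repeats the Token class in reduced form (one operator only) and adds a helper, NumberAccessThrows, which is a made-up name for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Token as in the article, reduced to what this sketch needs.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: reports whether reading the token as a number throws.
inline bool NumberAccessThrows(const Token &token) {
    try {
        (void)static_cast<double>(token);
    } catch (const std::logic_error &) {
        return true;
    }
    return false;
}
```

Reading the inactive member of the union would be undefined behavior, so the type check in each conversion operator is what makes the shared storage safe to use.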
With Token in place, back to the lexer and the single-digit test. The simplest change that passes it:

if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };
The next test is a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
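The C library already has what is needed here: isdigit to recognize a digit and atof to convert text to a number, or rather their counterparts for wchar_t (iswdigit, and _wtof in the Microsoft runtime). Calling them is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current)
        return{};
    if(iswdigit(*current))
        return{ _wtof(current) }; // _wtof is Microsoft-specific
    return{ static_cast<Operator>(*current) };
}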
Next test: an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}

Before making it pass, a small refactoring: tokens are collected in a result variable instead of being returned directly.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current)
        return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the transformation: (if → while). The loop keeps scanning until it reaches the end of the string.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
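wcstod replaces _wtof here because it does more than return the value: through its second argument it reports where the number ended, which is exactly where scanning has to continue.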
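The end-pointer behavior of wcstod can be checked in isolation. In the sketch below, ScannedNumber and ScanNumber are names invented for the illustration; only std::wcstod itself comes from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical result type for this sketch.
struct ScannedNumber {
    double value;        // the parsed number
    std::size_t length;  // how many characters it occupied
};

// Parses the number at the start of `text` and reports how far it extends.
// `length` is 0 when `text` does not start with a number.
inline ScannedNumber ScanNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    return { value, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.34+5" the reported length is 5, so the caller resumes at the plus sign. Note that wcstod on its own also skips leading whitespace and accepts a leading sign; the lexer only calls it after iswdigit has matched, which is what keeps "+" a separate operator token.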
The next test: spaces are skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}

Another (unconditional → if): a character becomes an operator token only if it is a plus, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
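The loop at this stage can be tried with any standard compiler if the token type is simplified. The sketch below is an approximation for experimenting, not the article's code: SimpleToken and TokenizeSketch are invented names, and the struct replaces the article's Token class.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Simplified stand-in for the article's Token: a kind flag plus both payloads.
struct SimpleToken {
    bool isNumber;
    double number;  // meaningful when isNumber is true
    wchar_t op;     // meaningful when isNumber is false
};

inline std::vector<SimpleToken> TokenizeSketch(const std::wstring &expr) {
    std::vector<SimpleToken> result;
    const wchar_t *current = expr.c_str();
    while (*current) {
        if (std::iswdigit(*current)) {
            // A digit starts a number; wcstod tells us where it ends.
            wchar_t *end = nullptr;
            const double value = std::wcstod(current, &end);
            result.push_back({ true, value, L'\0' });
            current = end;
        } else if (*current == L'+') {
            result.push_back({ false, 0.0, *current });
            ++current;
        } else {
            ++current;  // skip spaces and anything not recognized yet
        }
    }
    return result;
}
```

Because the final else branch advances on every unrecognized character, the loop always terminates, but it also silently drops invalid input, which matches the behavior of the code at this step.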
The function has grown enough to deserve a refactoring. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize is reduced to a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const {
        return m_result;
    }

private:
    bool EndOfExperssion() const {
        return *m_current == L'\0';
    }

    bool IsNumber() const {
        return iswdigit(*m_current) != 0;
    }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() {
        ++m_current;
    }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
What remains is the rest of the operators and the parentheses. A test with a more complex expression covers them all.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}

The Operator enumeration gets the missing values.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
IsOperator in Tokenizer is updated to recognize them.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
}
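For comparison, the same membership check can be written as a lookup in a string of operator characters. This is an alternative technique, not what the article uses, and IsOperatorChar is a name invented for the sketch.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to the any_of version: look the character up in a
// string of operator characters. The explicit L'\0' check is needed because
// wcschr treats the terminator as part of the string and would match it.
inline bool IsOperatorChar(wchar_t c) {
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

Either way the set of operators is written out by hand in one more place besides the Operator enumeration, so adding an operator means updating both.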
This completes the lexer.
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
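Assert also needs a ToString function for TokenType in order to print it. The function is straightforward and is included in the full listing at the end of this section. The next test checks the type of a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The test fails. To make it pass, replace the constant with a member variable (constant → scalar).
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, the operator itself should be retrievable from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
Store the operator in the token and add a conversion operator.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same is needed for the value of a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}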
A token holds either an operator or a number, but never both, so the two fields can share storage in a union. Each conversion operator checks the token type first and throws if it does not match.
Token:
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // Checked access: reading the wrong union member is reported as an error.
    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
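To see the checked conversions at work, the class can be exercised on its own. The sketch below repeats a condensed copy of Token so that it compiles standalone; ThrowsWhenReadAsNumber is a hypothetical helper added only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Condensed copy of the article's Token, kept only to make the sketch self-contained.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: reports whether reading the token as a number is rejected.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        (void)static_cast<double>(token);
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

The type tag is what makes the union safe to use: every read goes through a conversion operator that verifies which member is active before touching it.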
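Token can now represent both operators and numbers, so return to the lexer, where the single-digit test is still waiting. The simplest change that makes it pass is a check for a digit before the existing return:
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
The test passes, and there is nothing to refactor. The next test is a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}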
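There is no need to parse numbers by hand. The C standard library already has isdigit for recognizing digits and atof for converting a string to a number, and both have counterparts for wchar_t (iswdigit and _wtof in the code below). This replaces the expression with a function call (expression → function).
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    // _wtof is specific to the Microsoft C runtime.
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
The test passes. The next test is an operator followed by a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}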
The lexer now has to return more than one token. Before changing the behaviour, prepare the code: collect the tokens in a local variable named result and return it at the end. The earlier tests still pass, and the new one still fails.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now apply the transformation (if → while), so that the whole expression is scanned token by token.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
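wcstod replaces _wtof here because, besides the parsed value, it reports through its second argument where the number ended. The scan then continues from that position, so there is no need to work out the length of the number separately.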
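The role of the end pointer is easiest to see in a small standalone function. The sketch below is an illustration under assumptions: ScanNumbers is a hypothetical helper, not part of the article's lexer, and it relies on the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cassert>
#include <cwchar>
#include <vector>

// Hypothetical helper: collects every number found in a wide string,
// skipping any character that does not start a digit sequence.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    while(*text) {
        if(*text >= L'0' && *text <= L'9') {
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(text, &end)); // end points just past the number
            text = end;                                  // resume right after it
        } else {
            ++text;
        }
    }
    return numbers;
}
```

Because wcstod is only called when the current character is a digit, it always consumes at least one character, so the loop is guaranteed to make progress.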
All tests pass. The next test checks that whitespace is skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
Apply (unconditional → if) again: only a known operator character produces an operator token, and any other character is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
All tests pass, so it is time to refactor. The scanning logic moves into a Tokenizer class in a nested Detail namespace, and the Tokenize function becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer:
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The remaining operators are still missing. Rather than adding them one at a time, write a single test with a complex expression that covers all of them.
TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
Add the missing values to the Operator enumeration so that the test compiles.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The test still fails. Update the IsOperator method of the Tokenizer class so that it recognizes every operator (std::any_of requires the <algorithm> header).
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {
        return *m_current == static_cast<wchar_t>(o);
    });
}
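The same membership test can also be written as a lookup in a string of operator characters. This is a hedged alternative, not the article's implementation: IsOperatorChar is a hypothetical free function, and the operator list is repeated by hand, so it has to be kept in sync with the Operator enumeration.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to the any_of version: look the character up in a
// string that lists every supported operator.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also matches the terminating L'\0', so reject it explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The any_of version has the advantage of being driven by the enumeration values themselves; the string version is shorter but duplicates the character set.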
All tests pass. The complete code of the lexer and its tests at this point:
Interpreter.h:
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer

} // namespace Interpreter
InterpreterTests.cpp:
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
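For readers without Visual Studio, the scanning strategy can be tried with a condensed, framework-free sketch. Everything named here (SimpleToken, Num, Op, TokenizeSimple) is a simplification invented for this illustration and differs from the article's Token class: it uses a plain struct instead of a checked union and a character lookup instead of the Operator enumeration.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Simplified stand-in for the article's Token: a kind tag plus both payloads.
struct SimpleToken {
    bool isNumber;
    double number;   // meaningful when isNumber is true
    wchar_t op;      // meaningful when isNumber is false
};

inline bool operator==(const SimpleToken &a, const SimpleToken &b) {
    if(a.isNumber != b.isNumber) return false;
    return a.isNumber ? a.number == b.number : a.op == b.op;
}

inline SimpleToken Num(double value) { return SimpleToken{ true, value, L'\0' }; }
inline SimpleToken Op(wchar_t c) { return SimpleToken{ false, 0.0, c }; }

// Same scanning strategy as Detail::Tokenizer: numbers via wcstod,
// single-character operators, everything else skipped.
inline std::vector<SimpleToken> TokenizeSimple(const std::wstring &expr) {
    std::vector<SimpleToken> result;
    const wchar_t *current = expr.c_str();
    while(*current != L'\0') {
        if(std::iswdigit(static_cast<std::wint_t>(*current))) {
            wchar_t *end = nullptr;
            result.push_back(Num(std::wcstod(current, &end)));
            current = end;
        } else if(std::wcschr(L"+-*/()", *current) != nullptr) {
            result.push_back(Op(*current));
            ++current;
        } else {
            ++current;
        }
    }
    return result;
}
```

Calling TokenizeSimple(L" 1 + 2*(3) ") yields seven tokens, with the spaces dropped. Like the article's lexer, this sketch silently skips characters it does not recognize instead of reporting an error.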
The code for this article is published on GitHub.
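The test does not compile, because neither the Tokenize function nor its return type exists yet.
The token list is simply a std::vector, named Tokens.
A minimal header is enough to make the test compile. It still fails, because Tokenize only throws.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer

} // namespace Interpreter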
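The test now compiles and fails, as it should. The next step is to make it pass with the simplest possible change. A useful guide for choosing that change is The Transformation Priority Premise (TPP) by Robert Martin. Transformations are the counterpart of refactorings: a refactoring changes the structure of the code and keeps its behaviour, while a transformation changes the behaviour. Transformations are ordered from simple to complex, and the premise is that, to make a failing test pass, the simplest suitable transformation should be preferred; if only a complex one fits, it may be worth choosing a different test first. Following this order tends to lead to simpler algorithms and keeps each step of the cycle small.
The transformations, in order of priority:
- ({} → nil) Going from no code at all to code that returns nothing.
- (nil → constant) Replacing nil with a constant.
- (constant → constant+) Replacing a simple constant with a more complex one.
- (constant → scalar) Replacing a constant with a variable or an argument.
- (statement → statements) Adding unconditional statements (break, continue, return and the like).
- (unconditional → if) Splitting the execution path.
- (scalar → array) Replacing a variable or argument with an array.
- (array → container) Replacing an array with a more complex container.
- (statement → recursion) Replacing a statement with recursion.
- (if → while) Replacing a conditional with a loop.
- (expression → function) Replacing an expression with a function.
- (variable → assignment) Changing the value of a variable.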
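A few of these transformations can be shown on a deliberately tiny function. The example below is hypothetical and not taken from the article: CountPlusV1 through CountPlusV3 are invented names for three successive versions of one function, each version being the simplest change that would satisfy one more test.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical example (not from the article): the same function after three
// successive transformations, each one driven by a new failing test.

// (nil -> constant): the first test only needs a fixed answer.
inline int CountPlusV1(const std::wstring &) { return 0; }

// (unconditional -> if): a second test, for the input "+", splits the execution path.
inline int CountPlusV2(const std::wstring &text) {
    if(!text.empty() && text[0] == L'+') return 1;
    return 0;
}

// (if -> while): a third test, for "++", turns the conditional into a loop.
inline int CountPlusV3(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(text[i] == L'+') ++count;
        ++i;
    }
    return count;
}
```

Each version keeps all earlier tests passing, which is the point of choosing the lowest-priority change that works: the code never grows more general than the tests require.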
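The first transformation on the list is enough for the first test: Tokenize simply returns an empty list.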
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4),
                Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
The code is available on GitHub.
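For the first test to compile, the lexer needs a Tokenize function that returns a list of tokens. The list is a std::vector, aliased as Tokens. The first version only throws, so the test fails, as the red step requires.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer

    } // namespace Interpreter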
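To move from a failing test to a passing one in the smallest possible step, it helps to follow The Transformation Priority Premise (TPP). The idea is that each change to the production code is a transformation that makes the code slightly more general, and that simpler transformations should be preferred over more complex ones. When a test can only be passed with a complex transformation, that is a hint to look for a simpler test first.

The transformations, from the simplest to the most complex:
- ({} → nil) From no code at all to code that returns nothing useful.
- (nil → constant) Returning a constant.
- (constant → constant+) Replacing a simple constant with a more complex one.
- (constant → scalar) Replacing a constant with a variable or an argument.
- (statement → statements) Adding more unconditional statements (break, continue, return and the like).
- (unconditional → if) Splitting the execution path.
- (scalar → array) Replacing a variable or argument with an array.
- (array → container) Replacing an array with a more complex container.
- (statement → recursion) Replacing a statement with recursion.
- (if → while) Replacing a condition with a loop.
- (expression → function) Replacing an expression with a function.
- (variable → assignment) Changing the value of a variable.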
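As an illustration of how the list is used, here is a minimal sketch that is not part of the interpreter. It assumes a hypothetical function that sums a vector and shows three versions of it, each produced from the previous one by a single transformation in response to a new test. The names SumV1, SumV2 and SumV3 are invented for this sketch.

```cpp
#include <cstddef>
#include <vector>

// (nil -> constant): the test "an empty vector sums to 0" passes with a constant.
inline int SumV1(const std::vector<int> &) {
    return 0;
}

// (unconditional -> if): a test with one element forces a second execution path.
inline int SumV2(const std::vector<int> &values) {
    if (values.empty()) return 0;
    return values[0];
}

// (if -> while): a test with several elements turns the branch into a loop.
inline int SumV3(const std::vector<int> &values) {
    int total = 0;
    std::size_t i = 0;
    while (i < values.size()) {
        total += values[i];
        ++i;
    }
    return total;
}
```

Each version is only as general as the tests written so far require, which is the point of picking the simplest transformation available.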
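The simplest transformation is enough here: instead of throwing, return an empty list.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }

The test passes. Before going further, std::string is replaced with std::wstring, so that expressions are handled as Unicode strings. The next test checks a single plus operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }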
AssertRange is a small helper namespace for the tests. Its AreEqual function compares an expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
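The same comparison can be tried outside Visual Studio. The sketch below is a framework-free analogue of AssertRange::AreEqual, written under the assumption that only the standard library is available. FirstMismatch is a hypothetical name, and instead of failing a test it returns the position of the first difference.

```cpp
#include <initializer_list>
#include <iterator>
#include <vector>

// Returns the index of the first position where the two ranges differ,
// or -1 when they have the same length and equal elements.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;

    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter)) return position;
    }

    // One range ended before the other: the lengths differ at this position.
    if (expectIter != std::end(expect) || actualIter != std::end(actual)) return position;
    return -1;
}
```

As in the original helper, the expected values are passed as a braced list, so a call reads like FirstMismatch({ 1, 2, 3 }, values).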
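Operator is an enum class whose underlying type is wchar_t, so the value of each enumerator is the character of the operator itself. For now, Token is just another name for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;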
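To report a failed comparison, Assert has to print the values involved, and for that it needs a ToString function for the compared type. We add one for Token.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }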
Now the test compiles and fails, because Tokenize still returns an empty list. To make it pass, we apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }

The next test is for a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
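This test cannot pass while Token is just another name for Operator: a token must now be able to hold a number as well. There are several ways to represent a token that is either one or the other:
- A hierarchy of classes with a common base class, Token, and dynamic_cast to get at the concrete type.
- An approach built on std::function.
- A ready-made container for values of any type, such as Boost.Any. This project, however, uses only the standard library.

Here Token becomes a single small class that stores its type together with its value.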
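For comparison, a compiler with C++17 support offers one more option for a token that is either an operator or a number: std::variant. It was not available in the Visual Studio 2013 toolchain used in this article, and it is not the approach taken here. The sketch below is only an illustration; VariantToken, IsNumber and IsOperator are hypothetical names, and the Operator enum is cut down to two values.

```cpp
#include <variant>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// The variant itself remembers which alternative it currently holds,
// so no separate type field is needed.
using VariantToken = std::variant<Operator, double>;

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}

inline bool IsOperator(const VariantToken &token) {
    return std::holds_alternative<Operator>(token);
}
```

Reading the wrong alternative with std::get throws std::bad_variant_access, which plays the same role as a checked conversion in a hand-written class.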
Token gets its own test class, TokenTests. The first test asks an operator token for its type, and the simplest implementation returns a constant.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}

        TokenType Type() const {
            return TokenType::Operator;
        }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };

As before, Assert needs a ToString function, this time for TokenType. The next test is for a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
The test fails. We apply (constant → scalar): the constant becomes a member variable that each constructor sets.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}

        Token(double) :m_type(TokenType::Number) {}

        TokenType Type() const {
            return m_type;
        }

    private:
        TokenType m_type;
    };

Next, an operator token should give back its operator.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
The operator is stored in a field and returned through a conversion operator.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

        operator Operator() const {
            return m_operator;
        }
        …
        Operator m_operator;
    };

The same is needed for numbers.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token is never an operator and a number at the same time, so the two fields can share memory in a union. The conversion operators check the stored type before reading it.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const {
            return m_type;
        }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // A wide literal is required: the function returns std::wstring.
                return L"Unknown token.";
        }
    }
With Token ready, we go back to the lexer. The single-digit test passes with one more condition:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };

The next test is for a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
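Rather than parse the number by hand, we use the C library. It has isdigit to recognize a digit and atof to convert a string to a number, and both have counterparts for wchar_t: iswdigit and _wtof. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }

The test passes. The next one combines an operator and a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }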
The test fails, because Tokenize returns as soon as it has one token. As a preparatory step, the tokens are collected in a local variable, result, which is returned at the end.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }

Now the transformation itself: (if → while). The loop runs until the end of the string.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
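wcstod replaces _wtof here because, besides the value, it returns through its second argument a pointer to the first character after the number. That is exactly what the loop needs in order to continue from the right place.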
The next test adds spaces to the expression.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

We apply (unconditional → if): a character becomes an operator token only if it is a plus sign, and anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }
The function has grown, so this is a good moment to refactor. The logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const {
            return m_result;
        }

    private:
        bool EndOfExperssion() const {
            return *m_current == L'\0';
        }

        bool IsNumber() const {
            return iswdigit(*m_current) != 0;
        }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() {
            ++m_current;
        }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
The last lexer test uses an expression with all the operators and with parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
Operator gets the remaining operators and the two parentheses.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
The IsOperator method of Tokenizer is updated to recognize all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }
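Because every Operator value is the operator's own character, the same check can also be written as a lookup in a string of operator symbols. This is only a sketch of an alternative, not the version used in the article; IsOperatorChar is a hypothetical free function, and the string has to be kept in sync with the enum by hand.

```cpp
#include <cwchar>

// Looks the character up in a string that lists every operator symbol.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also matches the terminating L'\0', so exclude it explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The any_of version has the advantage that the compiler checks the enumerator names, while this one is shorter but repeats the characters.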
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
The test now compiles and fails. To make it pass we apply (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return {};
    }
    return { static_cast<Operator>(expr[0]) };
}
The next test is for a number, starting with a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
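This test does not compile: a token must now be able to hold either an operator or a number. There are several ways to achieve this:
- a hierarchy of classes derived from a base Token, distinguished with dynamic_cast;
- std::function;
- a type-erased holder such as Boost.Any.
We settle on a single Token class that reports its own type, and develop it test-first in a separate test class. The lexer test is set aside until Token is ready.
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};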
A ToString overload for TokenType is needed as well, so that Assert can print it. Next test.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The test fails. We apply (constant → scalar): the returned constant becomes a member variable set by the constructor.
class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, an operator token should give back its operator.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The operator is stored in the token and exposed through a conversion operator.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same for a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two values can share storage in a union. Reading a token as the wrong kind of value is a programming error and throws.
Token
#include <stdexcept>
#include <string>

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
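The type check in the conversion operators is what keeps the union safe to use. The sketch below repeats a trimmed copy of Token (only the Plus operator) so that it compiles on its own, and adds a hypothetical helper, ThrowsWhenReadAsNumber, that is not part of the article's code and exists only to show the behaviour.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Trimmed copy of the Token class from the text, repeated for self-containment.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a number is rejected.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        (void)static_cast<double>(token);
        return false;
    } catch (const std::logic_error &) {
        return true;
    }
}
```

With these definitions, ThrowsWhenReadAsNumber(Token(Operator::Plus)) is true and ThrowsWhenReadAsNumber(Token(1.5)) is false, so the inactive union member is never read.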
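Token now supports both kinds of value, so we can return to the lexer. The single-digit test needs only a small change in Tokenize:
if(expr[0] >= L'0' && expr[0] <= L'9') {
    return { static_cast<double>(expr[0] - L'0') };
}
return { static_cast<Operator>(expr[0]) };
The test passes. The next one is a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}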
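Instead of parsing digits by hand we use the C library: isdigit to recognise a digit and atof to convert the text, or rather their wchar_t counterparts. This is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return {};
    // _wtof is the Microsoft-specific wide-character version of atof.
    if(iswdigit(*current)) return { _wtof(current) };
    return { static_cast<Operator>(*current) };
}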
The test passes. Next, an operator followed by a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
Before making it pass we refactor: the tokens are collected in a result variable instead of being returned immediately.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the required transformation is easy to see: (if → while). The loop consumes the expression token by token.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            // wcstod stores in `end` the position just past the parsed number.
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
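wcstod does the same job as _wtof, but it also reports, through its second argument, where the number ends. We use that pointer to continue scanning immediately after the number.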
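The end-pointer behaviour of wcstod is easy to check in isolation. The following is a minimal sketch; ParseLeadingNumber is a hypothetical wrapper introduced only for this illustration, while std::wcstod itself is the standard library function used by the lexer.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical wrapper: parses the number at the start of `text` and reports
// in `consumed` how many characters belonged to it.
inline double ParseLeadingNumber(const wchar_t *text, std::size_t &consumed) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    // When no number could be parsed, `end` equals `text` and nothing is consumed.
    consumed = static_cast<std::size_t>(end - text);
    return value;
}
```

For the input L"12.34+5" the wrapper consumes five characters and leaves the scan position on the plus sign. The lexer only calls wcstod when the current character is a digit, so the zero-characters case does not arise there. Note that wcstod also accepts forms such as exponents (1e3), so the lexer inherits that syntax without any test describing it.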
Next test: spaces should be skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
One more (unconditional → if): only a plus sign is treated as an operator, and any other character is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        // Anything else, including spaces, is skipped.
        ++current;
    }
}
Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The last lexer test covers a complete expression with all the operators and parentheses.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
Operator gets the missing values.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
IsOperator in Tokenizer now checks the current character against all of them (std::any_of requires the <algorithm> header).
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
All tests pass. The complete code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <cwctype>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
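To get the first test to compile, we declare Tokens as a std::vector of Token and add a Tokenize stub. The stub throws, so the test fails (red).
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter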
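To decide how to change the code so that a failing test passes, we follow The Transformation Priority Premise (TPP). It orders code transformations from simple to complex and suggests preferring the simplest transformation that makes the current test pass, and choosing the next test so that a simple transformation is enough.
The transformations, from highest to lowest priority:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) From nil to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) From a constant to a variable or an argument.
- (statement → statements) Adding more unconditional statements (break, continue, return and the like).
- (unconditional → if) Splitting the execution path.
- (scalar → array) From a variable or argument to an array.
- (array → container) From an array to a more complex container.
- (statement → recursion) From a statement to recursion.
- (if → while) From a condition to a loop.
- (expression → function) From an expression to a function or algorithm.
- (variable → assignment) Changing the value of a variable.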
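As a rough illustration of how such transformations follow one another, the sketch below evolves a toy summing function through three steps. The function names and the example itself are hypothetical and do not come from the interpreter; they only show the idea of reaching for the simplest change that satisfies each new test.

```cpp
#include <vector>

// Step 1, (nil -> constant): a test that sums an empty list
// passes with a hard-coded constant.
inline int SumV1(const std::vector<int> &) {
    return 0;
}

// Step 2, (unconditional -> if): a test with one element forces
// a second execution path that reads the input.
inline int SumV2(const std::vector<int> &values) {
    if (values.empty()) {
        return 0;
    }
    return values[0];
}

// Step 3, (if -> while): a test with several elements turns
// the condition into a loop.
inline int SumV3(const std::vector<int> &values) {
    int total = 0;
    for (int value : values) {
        total += value;
    }
    return total;
}
```

Each version is only as general as the tests written so far require, which is the habit the lexer steps in this article rely on.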
Returning an empty list is enough to make the first test pass (green).
inline Tokens Tokenize(std::string expr) { return {}; }
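Before moving on, we change the parameter type from std::string to std::wstring so that the lexer accepts Unicode input. The next test expects a single plus sign to become one operator token.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}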
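AssertRange is our own helper namespace. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    // Compare the sizes first so that a missing or extra token is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    // Then compare element by element and report the index of the first mismatch.
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange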
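The comparison logic of this helper does not depend on the test framework. The sketch below shows the same idea with the standard library only; FirstMismatch is a hypothetical name introduced for illustration, and returning -1 for equal ranges is an assumption of this sketch, not something the article's helper does.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first differing element,
// the length of the shorter range when only the sizes differ,
// or -1 when both ranges are equal.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = expect.begin();
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for (; expectIter != expect.end() && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++index) {
        if (!(*expectIter == *actualIter)) {
            return index;
        }
    }
    const bool sameSize = expectIter == expect.end() && actualIter == std::end(actual);
    return sameSize ? -1 : index;
}
```

A call such as FirstMismatch({ 1, 9, 3 }, std::vector<int>{ 1, 2, 3 }) points at the second element, which is the kind of position that the "Mismatch in position" message reports.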
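Operator is an enumeration with wchar_t as its underlying type, so each value is the character that denotes the operator. For now a token is simply an operator.
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;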
To print values in failure messages, Assert needs a ToString function for the compared type, so we provide one for Token.
inline std::wstring ToString(const Token &token) {
    return { static_cast<wchar_t>(token) };
}
Now the test compiles and fails. To pass it without breaking the first test, we apply (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return {};
    }
    return { static_cast<Operator>(expr[0]) };
}
The next test expects a single digit to become a number token.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
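A token must now be able to hold a number as well as an operator. The options considered include a Token class hierarchy queried with dynamic_cast, std::function, and Boost.Any. We settle on a single Token class that stores its type explicitly, and develop it test-first in its own test class.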
The first Token test checks that an operator token reports its type.
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
A ToString function for TokenType is needed as well, so that Assert can print it. The next test covers number tokens.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
We apply (constant → scalar): the hard-coded type becomes a member that each constructor sets.
class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, an operator token should give back its operator.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
We store the operator and add a conversion operator.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same is expected of a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the token type and throw on misuse.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
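To see what the type check buys, here is a minimal standalone sketch of the same tagged-union idea. Value, Kind, AsOp, AsNum and ThrowsOnWrongAccess are hypothetical names used only for this illustration, and named accessors are used instead of conversion operators purely to keep the example short.

```cpp
#include <stdexcept>

enum class Kind { Op, Num };

// A value that is either an operator character or a number.
// The tag m_kind records which union member is currently valid.
class Value {
public:
    explicit Value(wchar_t op) : m_kind(Kind::Op), m_op(op) {}
    explicit Value(double num) : m_kind(Kind::Num), m_num(num) {}

    Kind Type() const { return m_kind; }

    wchar_t AsOp() const {
        if (m_kind != Kind::Op) throw std::logic_error("Should be operator.");
        return m_op;
    }
    double AsNum() const {
        if (m_kind != Kind::Num) throw std::logic_error("Should be number.");
        return m_num;
    }

private:
    Kind m_kind;
    union {
        wchar_t m_op;
        double m_num;
    };
};

// Returns true when reading the value as a number is rejected.
inline bool ThrowsOnWrongAccess(const Value &value) {
    try {
        value.AsNum();
    } catch (const std::logic_error &) {
        return true;
    }
    return false;
}
```

Without the check, reading the inactive union member would silently return garbage; with it, the mistake surfaces as an exception at the point of misuse.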
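With Token in place, we return to the lexer. The single-digit test passes if a digit character is converted to its numeric value:
if(expr[0] >= '0' && expr[0] <= '9') {
    return { (double) expr[0] - '0' };
}
return { static_cast<Operator>(expr[0]) };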
The next test uses a number with several digits and a decimal point.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
Instead of parsing digits by hand, we use the C library: isdigit and atof, or rather their wchar_t counterparts iswdigit and _wtof. This is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return {};
    if(iswdigit(*current)) return { _wtof(current) };
    return { static_cast<Operator>(*current) };
}
The next test combines an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
To prepare for several tokens, we first collect them in a local variable result, which is returned at the end.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the next transformation is obvious: (if → while). The loop scans the expression until it reaches the terminating null character.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
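wcstod does the same job as _wtof, but it also returns, through its second argument, a pointer to the first character after the parsed number. That lets the loop continue right where the number ends.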
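The end pointer of wcstod can be tried in isolation. The sketch below wraps a single call; ScanResult and ScanLeadingNumber are hypothetical names for this illustration and are not part of the interpreter.

```cpp
#include <cstddef>
#include <cwchar>

// Value of the number at the start of the text and how many
// characters it occupies. A length of 0 means nothing was parsed.
struct ScanResult {
    double value;
    std::size_t length;
};

inline ScanResult ScanLeadingNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    // wcstod stores in 'end' the address of the first character it did not
    // consume. If no conversion is possible, 'end' is set to 'text'.
    const double value = std::wcstod(text, &end);
    return { value, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.34+5" the scan consumes the five characters of 12.34 and stops at the plus sign. Note that wcstod itself accepts leading whitespace and a sign, which is why the lexer only calls it when the current character is a digit.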
The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
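(unconditional → if) again: a character is stored as an operator only if it is a plus sign, and any other character is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current; // skip spaces and anything else we do not recognize
    }
}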
Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
The Detail::Tokenizer class:
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The last lexer test is a complete expression with all operators and parentheses.
TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
Operator gets the remaining operators and the parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
IsOperator in Tokenizer now checks the current character against every known operator.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
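The same membership test can be written without the enumeration list. The sketch below is an alternative, not the article's approach: IsOperatorChar is a hypothetical free function, and it repeats the operator characters in a string literal, so the literal and the Operator enumeration would have to be kept in sync by hand.

```cpp
#include <cwchar>

// Returns true when ch is one of the operator or parenthesis characters.
inline bool IsOperatorChar(wchar_t ch) {
    // wcschr also finds the terminating L'\0' of the searched string,
    // so the null character is excluded explicitly.
    return ch != L'\0' && std::wcschr(L"+-*/()", ch) != nullptr;
}
```

The std::any_of version keeps the enumeration as the single source of truth, which is a reasonable argument for the form used in the article.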
All tests pass, and the lexer is complete. The full code at this point follows.
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }
    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }
    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }
    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }
    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }
    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code for this article is available on GitHub.
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
The source code is available on GitHub.
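For the first test to compile, Interpreter.h needs a Tokens type, declared as a std::vector of tokens, and a Tokenize function. The function starts as a stub that throws, so the test fails (red).

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception(); // stub: the first test must fail
    }

    } // namespace Lexer
    } // namespace Interpreter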
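Now the test has to pass, and with the smallest possible change. A helpful guide here is The Transformation Priority Premise (TPP). Where refactorings change the structure of code without changing its behaviour, transformations change the behaviour, and each new test is made to pass by applying one of them. The transformations are ordered from simple to complex, and the premise is that the simpler one should be preferred whenever there is a choice.
The transformations, in order of priority:
- ({} → nil) From no code at all to code that uses nil.
- (nil → constant) Replace nil with a constant.
- (constant → constant+) Replace a simple constant with a more complex one.
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements (break, continue, return and the like).
- (unconditional → if) Split the execution path.
- (scalar → array) Replace a variable or argument with an array.
- (array → container) Replace an array with a more complex container.
- (statement → recursion) Replace a statement with recursion.
- (if → while) Replace a condition with a loop.
- (expression → function) Replace an expression with a function or an algorithm.
- (variable → assignment) Replace the value of a variable.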
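As a rough illustration of how these transformations stack up, here is a sketch of one small function going through three of them. CountDigits and the namespaces Step1 to Step3 are made up for this example and are not part of the interpreter.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical example: count the decimal digits in a string.

namespace Step1 {
// (nil -> constant): the simplest code that passes a test for L"".
inline int CountDigits(const std::wstring &) { return 0; }
}

namespace Step2 {
// (unconditional -> if): a test for L"7" forces a split in the execution path.
inline int CountDigits(const std::wstring &text) {
    if(text.empty()) return 0;
    return std::iswdigit(text[0]) ? 1 : 0;
}
}

namespace Step3 {
// (if -> while): a test for L"a1b22" forces a loop over all characters.
inline int CountDigits(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(std::iswdigit(text[i])) ++count;
        ++i;
    }
    return count;
}
}

// Each step passes the test that motivated it.
inline void CheckSteps() {
    assert(Step1::CountDigits(L"") == 0);
    assert(Step2::CountDigits(L"7") == 1);
    assert(Step3::CountDigits(L"a1b22") == 3);
}
```

Each step is only as general as the tests written so far require, which is the point of preferring the simpler transformation.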
For the first test, the simplest change is to return an empty list.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }
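The first test passes. Before going further, the parameter type changes from std::string to std::wstring, with Unicode input in mind. The next test is a single plus operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }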
AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with the actual range: it checks the sizes first, then compares element by element and reports the position of the first mismatch.

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
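Outside Visual Studio the same idea can be tried without CppUnitTestFramework. The sketch below is a stand-in, not the article's helper: FirstMismatch is a hypothetical function that returns the index of the first differing element, or -1 when the ranges match, and treats a size difference as a mismatch at the length of the shorter range.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: index of the first mismatch, or -1 if the ranges are equal.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    // One range ended before the other: the sizes differ.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) return position;
    return -1;
}

inline void CheckFirstMismatch() {
    const std::vector<int> actual = { 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1); // equal
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);  // differs at index 1
    assert(FirstMismatch({ 1, 2 }, actual) == 2);     // expected range is shorter
}
```

Returning a position instead of asserting keeps the helper independent of any test framework; the caller decides how to report the failure.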
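Operator is introduced as an enumeration with wchar_t as its underlying type, so that the value of each operator is its own character code. For now, Token is simply an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;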
Assert has to turn values into strings for its failure messages, so a ToString function is needed for Token.

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }
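To make the test pass while keeping the previous one green, apply the (unconditional → if) transformation: an empty expression still gives an empty list, and anything else gives its first character as an operator.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }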
The next test on the list is a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
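This test raises a design question. So far Token is just an alias for Operator, but now a token has to hold either an operator or a number. Options that come up include a class hierarchy derived from Token with dynamic_cast to reach the concrete type, std::function, and something like Boost.Any.
The choice here is a single Token class that knows its own type. It gets its own test class, starting with the type of an operator token.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };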
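A ToString function is needed for TokenType as well, for the same reason as before. The next test asks for the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }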
The (constant → scalar) transformation replaces the hard-coded return value with a member variable.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
Next, the operator code has to be retrievable from an operator token.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

A conversion operator is enough to pass it.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same is needed for the value of a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }

Since a token is either an operator or a number, never both, the two values can share storage in a union. The conversion operators now check the type and throw on a mismatch.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }
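A short sketch of how this Token behaves from the caller's side. The class is repeated from the article, with Operator reduced to Plus, so that the block compiles on its own; ThrowsWhenReadAsNumber is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: true if reading the token as a number throws.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

inline void CheckToken() {
    Token number(1.23);
    Token plus(Operator::Plus);
    assert(static_cast<double>(number) == 1.23);
    assert(static_cast<Operator>(plus) == Operator::Plus);
    assert(!ThrowsWhenReadAsNumber(number));
    assert(ThrowsWhenReadAsNumber(plus)); // wrong kind of token: logic_error
}
```

The type check in the conversion operators is what makes the union safe to use: the member that was not written is never read.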
With Token in place, back to the lexer and the single-digit test. The simplest way to pass it:

    if(expr[0] >= L'0' && expr[0] <= L'9') {
        return{ static_cast<double>(expr[0] - L'0') };
    }
    return{ static_cast<Operator>(expr[0]) };
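The digit conversion works because the characters '0' to '9' have consecutive codes, so subtracting the code of '0' gives the numeric value. A minimal sketch, with DigitValue as a hypothetical helper name:

```cpp
#include <cassert>

// Hypothetical helper: numeric value of a decimal digit character.
// Relies on L'0'..L'9' having consecutive codes, which the C++ standard
// guarantees for the decimal digits.
inline double DigitValue(wchar_t ch) {
    assert(ch >= L'0' && ch <= L'9'); // precondition: ch is a decimal digit
    return static_cast<double>(ch - L'0');
}

inline void CheckDigitValue() {
    assert(DigitValue(L'0') == 0.0);
    assert(DigitValue(L'7') == 7.0);
}
```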
The next test is a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
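Here the standard C library helps: isdigit and atof, or rather their wchar_t counterparts, iswdigit and _wtof (the latter is specific to the Microsoft C runtime). This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }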
The next test combines an operator and a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

Before it can pass, a small refactoring is needed. Instead of returning from several places, the function collects tokens in a local variable result and returns it at the end.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        } else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }
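Now the transformation itself: (if → while). The expression is scanned in a loop until the terminating null character.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            } else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }

wcstod replaces _wtof here because, unlike _wtof, it reports where the number ended, so scanning can continue from that position.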
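The end pointer is easy to see in isolation. In this sketch std::wcstod parses the leading number and sets end to the first character it did not consume; ScanResult and ScanLeadingNumber are hypothetical names, not part of the lexer.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical wrapper: parses a number at the start of text and reports
// how many characters it occupied.
struct ScanResult {
    double value;
    std::size_t length;
};

inline ScanResult ScanLeadingNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    return { value, static_cast<std::size_t>(end - text) };
}

inline void CheckScanLeadingNumber() {
    const ScanResult result = ScanLeadingNumber(L"12.5+3");
    assert(result.value == 12.5);
    assert(result.length == 4); // "12.5" was consumed, '+' was not
}
```

One thing to keep in mind: wcstod also accepts exponent notation such as 1e3, so a lexer built on it inherits that syntax.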
Next, spaces should be skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
One more (unconditional → if): a character becomes an operator token only if it is a plus sign, and anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        } else {
            ++current;
        }
    }
Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
Finally, a test with a complete expression that contains all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
Operator gets the remaining values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
IsOperator in Tokenizer now checks the current character against all of them.

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
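An alternative sketch, not the article's code: because every operator value is its own character, the same check can be written as a lookup in a string of operator characters. IsOperatorChar and the literal it searches are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical free-function variant of the IsOperator check.
inline bool IsOperatorChar(wchar_t ch) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(ch) != std::wstring::npos;
}

inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'*'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
}
```

The trade-off is that the character list duplicates the values of the Operator enumeration, so the two have to be kept in sync by hand; the any_of version above avoids that by naming the enumerators.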
All tests pass. The complete code at this point:

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = {
                Operator::Plus, Operator::Minus, Operator::Mul,
                Operator::Div, Operator::LParen, Operator::RParen
            };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter

InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
                Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
                Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
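For readers without Visual Studio, here is a rough portable smoke test of the same scanning loop. It is a simplified sketch, not the article's code: Token is a plain struct with a flag instead of the class with a union, and MakeNumber, MakeOperator, IsOperatorChar and CheckTokenize are hypothetical names.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')'
};

// Simplified stand-in for the article's Token: a plain struct with a flag.
struct Token {
    bool isNumber;
    Operator op;   // meaningful only when isNumber is false
    double number; // meaningful only when isNumber is true
};

inline Token MakeNumber(double value) { return { true, Operator::Plus, value }; }
inline Token MakeOperator(Operator op) { return { false, op, 0.0 }; }

inline bool operator==(const Token &left, const Token &right) {
    if(left.isNumber != right.isNumber) return false;
    return left.isNumber ? left.number == right.number : left.op == right.op;
}

inline bool IsOperatorChar(wchar_t ch) {
    return std::wstring(L"+-*/()").find(ch) != std::wstring::npos;
}

// Same scanning loop as the article's Tokenizer, written as one function.
inline std::vector<Token> Tokenize(const std::wstring &expr) {
    std::vector<Token> result;
    const wchar_t *current = expr.c_str();
    while(*current != L'\0') {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(MakeNumber(std::wcstod(current, &end)));
            current = end;
        } else if(IsOperatorChar(*current)) {
            result.push_back(MakeOperator(static_cast<Operator>(*current)));
            ++current;
        } else {
            ++current; // skip spaces and anything unrecognised
        }
    }
    return result;
}

inline void CheckTokenize() {
    const std::vector<Token> expected = {
        MakeNumber(1), MakeOperator(Operator::Plus), MakeNumber(2.5),
        MakeOperator(Operator::Mul), MakeOperator(Operator::LParen),
        MakeNumber(3), MakeOperator(Operator::RParen)
    };
    assert(Tokenize(L" 1 + 2.5*(3) ") == expected);
    assert(Tokenize(L"").empty());
}
```

Calling CheckTokenize from main is enough to exercise the loop with any C++11 compiler. Note that, like the article's lexer, this version silently skips characters it does not recognise.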
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
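The first test does not compile yet. To fix that, we define the token list as a std::vector, Tokens, add an empty Token type, and declare a Tokenize stub. The stub throws, so the test now compiles and fails.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: just enough for the first test to compile and fail.
inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter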
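Now we need the smallest change that makes the test pass. To keep each step small, this article relies on The Transformation Priority Premise (TPP). Transformations are the counterpart of refactorings: a refactoring changes the structure of the code without changing its behavior, whereas a transformation changes the behavior, moving the code from a specific case toward a more general one. Transformations are ordered by complexity, and the premise is that a simpler transformation should be preferred over a more complex one when making a test pass. If the next test seems to demand a complex transformation, it is usually worth looking for a different test first. Following TPP tends to produce simpler algorithms in smaller steps.
The transformations, in order of priority:
- ({} → nil) No code at all becomes code that returns nil.
- (nil → constant) Nil is replaced with a constant.
- (constant → constant+) A simple constant is replaced with a more complex one.
- (constant → scalar) A constant is replaced with a variable or an argument.
- (statement → statements) More unconditional statements are added (break, continue, return and the like).
- (unconditional → if) The execution path is split.
- (scalar → array) A variable or argument is replaced with an array.
- (array → container) An array is replaced with a more complex container.
- (statement → recursion) A statement is replaced with recursion.
- (if → while) A condition is replaced with a loop.
- (expression → function) An expression is replaced with a function or an algorithm.
- (variable → assignment) The value of a variable is replaced.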
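To make the ordering more concrete, here is a minimal sketch of three successive transformations applied to a toy problem. The problem (counting L'+' characters) and the names Step1, Step2, Step3 and CountPluses are illustrative assumptions, not part of the interpreter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical kata, not from the article: count the L'+' characters in a string.

namespace Step1 {
// (nil → constant): enough for the test "an empty string gives 0".
inline std::size_t CountPluses(const std::wstring &) { return 0; }
}

namespace Step2 {
// (constant → scalar): enough once the test "L"+" gives 1" is added.
inline std::size_t CountPluses(const std::wstring &text) { return text.size(); }
}

namespace Step3 {
// (unconditional → if) and (if → while): needed once a test mixes
// pluses with other characters, for example L"1+2+3".
inline std::size_t CountPluses(const std::wstring &text) {
    std::size_t count = 0;
    for(wchar_t ch : text) {
        if(ch == L'+') ++count;
    }
    return count;
}
}

// Each step passes the tests that existed when it was written.
inline void CountPlusesExample() {
    assert(Step1::CountPluses(L"") == 0);
    assert(Step2::CountPluses(L"+") == 1);
    assert(Step3::CountPluses(L"1+2+3") == 2);
}
```

Each version is the simplest one that satisfies the tests written so far, and only a new failing test justifies the move to a transformation further down the list.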
Here the simplest transformation is enough: return an empty list.
inline Tokens Tokenize(std::string expr) { return{}; }
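The test passes, and there is nothing to refactor yet.
Before writing the next test we replace std::string with std::wstring in the Tokenize signature, so that expressions are passed as wide strings with Unicode in mind. The next test covers a single operator.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}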
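AssertRange is a small helper namespace of our own. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare sizes first, so a missing or extra token is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    // Then compare element by element and report the failing position.
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange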
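The same comparison can be tried outside Visual Studio. The sketch below is a framework-independent variant under stated assumptions: FirstMismatch is a hypothetical name, and instead of raising an assertion it returns the index of the first difference, or -1 when the ranges are equal.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper, not part of CppUnitTestFramework or the article.
// Returns the index of the first differing element, or -1 if the ranges match.
// When one range is shorter, the index just past its end is returned.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    for(; expectIter != std::end(expect) && actualIter != std::end(actual); ++expectIter, ++actualIter) {
        if(!(*expectIter == *actualIter)) {
            return std::distance(std::begin(expect), expectIter);
        }
    }
    // One range ended before the other: report where the shorter one stopped.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) {
        return std::distance(std::begin(expect), expectIter);
    }
    return -1;
}

inline void FirstMismatchExample() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1);
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);
}
```

Returning a position rather than asserting keeps the helper usable with any test framework. The price is that the caller has to build the failure message.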
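To make the test compile we introduce the Operator enumeration. Its underlying type is wchar_t, so the value of each operator is simply its character, and for now Token is just an alias for Operator.
enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;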
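The test still does not compile. To print values in failure messages, Assert needs a ToString function for the compared type, so we provide one.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    // The operator's value is its character, so a one-character string is enough.
    return{ static_cast<wchar_t>(token) };
}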
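Now the test compiles and fails, because Tokenize still returns an empty list. To make it pass we split the execution path (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}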
The test passes. The next test covers a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
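The test does not compile. A token now has to hold either an operator or a number, and there are several ways to achieve that:
- A class hierarchy with a common base, Token, where dynamic_cast is needed to get at the concrete type.
- An approach built on std::function.
- A ready-made container such as Boost.Any.
Instead of committing to one of these up front, we let tests drive the design of a small Token class of its own, starting with the type of the token.
The first test for the new class checks the type of an operator token.
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};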
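A ToString function for TokenType is needed as well, so that Assert can print it. With that in place the test passes, and we add the same test for a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}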
The test fails. We apply (constant → scalar) and replace the constant with a member variable.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
The next test reads the operator back from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
To make it pass, the token stores the operator and exposes it through a conversion operator.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same test for a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two members are placed in an anonymous union. Each conversion operator checks the type tag first and throws std::logic_error when the token holds the other kind of value.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only the member selected by m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        // The wide literal is required: the function returns std::wstring.
        default: return L"Unknown token.";
    }
}
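For comparison, the tagged union can also be expressed with std::variant. This is not what the article does, and it assumes a C++17 compiler, which the toolchain described above predates. The sketch keeps the same public interface so that the TokenTests above would apply unchanged, and the Operator enumeration is shortened for brevity.

```cpp
#include <cassert>
#include <stdexcept>
#include <variant>

// Shortened copies of the article's enumerations, repeated to keep the sketch self-contained.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Alternative sketch (assumes C++17): std::variant keeps track of the active member itself,
// so no separate m_type tag is needed.
class Token {
public:
    Token(Operator op) : m_value(op) {}
    Token(double num) : m_value(num) {}

    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator : TokenType::Number;
    }

    operator Operator() const {
        if(auto op = std::get_if<Operator>(&m_value)) return *op;
        throw std::logic_error("Should be operator token.");
    }

    operator double() const {
        if(auto num = std::get_if<double>(&m_value)) return *num;
        throw std::logic_error("Should be number token.");
    }

private:
    std::variant<Operator, double> m_value;
};

inline void VariantTokenExample() {
    Token number(1.23);
    assert(number.Type() == TokenType::Number);
    assert(static_cast<double>(number) == 1.23);
}
```

The hand-written union remains the smaller step in the TDD sense, since it needs nothing beyond what the tests already demanded. The variant version mainly removes the risk of the tag and the union getting out of sync.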
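Token can now hold a number, so we return to the lexer and the single-digit test.
The simplest code that makes it pass:
if(expr[0] >= L'0' && expr[0] <= L'9') {
    // A single digit: its numeric value is the distance from L'0'.
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };
The test passes, although only for one-digit numbers.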
The next test covers a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
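Rather than parse the number by hand, we use the C library: isdigit to detect a digit and atof to convert the text, or rather, since we work with wchar_t, their wide-character counterparts iswdigit and _wtof (the latter is specific to the Microsoft runtime). This is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}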
The test passes. The next one combines an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
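The test fails, because only the first token is returned. Before changing the behavior we prepare the code: tokens are collected in a result variable, which is returned at the end. The existing tests still pass.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}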
Now the transformation itself: (if → while). The condition becomes a loop that runs until the end of the string.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the number
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
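wcstod replaces _wtof here because, besides converting the number, it reports through its second argument where the number ended. That pointer is exactly what the loop needs to continue scanning from the next character. As a bonus, wcstod is part of the standard library.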
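The role of the end pointer is easier to see in isolation. The sketch below uses a hypothetical ScanNumbers helper, which is not part of the lexer. It collects every number in a string and ignores everything else, advancing with the pointer that std::wcstod returns. It assumes the default "C" locale, in which the decimal separator is a period.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper for illustration: extract all numbers, skip other characters.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod converts the longest valid prefix and stores in 'end'
            // a pointer to the first character it did not consume.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        }
        else {
            ++current;
        }
    }
    return numbers;
}

inline void ScanNumbersExample() {
    const std::vector<double> numbers = ScanNumbers(L"12.34+5");
    assert(numbers.size() == 2);
    assert(numbers[0] == 12.34);
    assert(numbers[1] == 5.0);
}
```

One property worth knowing: wcstod also accepts exponent notation, so an input like L"1e5" is consumed as a single number.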
The test passes. The next requirement is to ignore spaces.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
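We apply (unconditional → if): an operator token is produced only for a known operator character, and every other character is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current; // neither a number nor an operator: skip it
    }
}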
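Note that the final else branch skips any unrecognized character, not only whitespace, so an input such as L"1 a + 2" is tokenized as if the letter were absent. If that is undesirable, a separate check could reject such input before tokenizing. The sketch below is one possible approach and is not taken from the article. ValidateExpression is a hypothetical name, and the set of accepted characters is an assumption based on the operators the lexer ends up supporting.

```cpp
#include <cassert>
#include <cwctype>
#include <stdexcept>
#include <string>

// Hypothetical pre-check, not part of the article's lexer: accept only whitespace,
// digits, the decimal point and the operator characters + - * / ( ).
inline void ValidateExpression(const std::wstring &expr) {
    const std::wstring operators = L"+-*/()";
    for(wchar_t ch : expr) {
        const bool known = std::iswspace(ch) != 0
            || std::iswdigit(ch) != 0
            || ch == L'.'
            || operators.find(ch) != std::wstring::npos;
        if(!known) {
            throw std::invalid_argument("Unexpected character in expression.");
        }
    }
}

inline void ValidateExpressionExample() {
    ValidateExpression(L" 1 + 12.34 "); // accepted, does not throw
    bool rejected = false;
    try { ValidateExpression(L"1 a + 2"); }
    catch(const std::invalid_argument &) { rejected = true; }
    assert(rejected);
}
```

In a test-first workflow such a check would of course start as a failing test, for example one that expects an exception for an unknown character.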
The test passes, but the function has grown enough to deserve refactoring. The scanning logic moves into a Tokenizer class in the Detail namespace, where every step gets a named method. The Tokenize function becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
All tests still pass after the refactoring. The last test for the lexer is an expression that uses all the operators and parentheses.
TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
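The Operator enumeration is extended with the missing operators and the parentheses, which makes the test compile.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};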
The test fails. To fix it, the IsOperator method of the Tokenizer class has to recognize every value of the enumeration.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {
        return *m_current == static_cast<wchar_t>(o);
    });
}
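The same check can be written as a switch over the enumeration, which avoids building the list on every call and lets many compilers warn when a newly added enumerator is not handled. This is an alternative sketch and not the article's code. IsOperatorChar is a hypothetical free function, and the enumeration is repeated only to keep the example self-contained.

```cpp
#include <cassert>

// Copy of the article's enumeration, repeated to keep the sketch self-contained.
enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical alternative to Tokenizer::IsOperator. The cast is well defined for any
// character, because the enumeration has a fixed underlying type (wchar_t).
inline bool IsOperatorChar(wchar_t ch) {
    switch(static_cast<Operator>(ch)) {
        case Operator::Plus:
        case Operator::Minus:
        case Operator::Mul:
        case Operator::Div:
        case Operator::LParen:
        case Operator::RParen:
            return true;
        default:
            return false;
    }
}

inline void IsOperatorCharExample() {
    assert(IsOperatorChar(L'+'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
}
```

Either form passes the same tests, so the choice is a matter of taste at this stage.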
All tests pass. The complete code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    // Only the member selected by m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"
#include <initializer_list>
#include <iterator>
#include <string>

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Deliberately fails for now: the first test has to be red.
inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
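The transformations, from highest to lowest priority:
- ({} → nil) No code at all becomes code that uses nil.
- (nil → constant) Nil is replaced with a constant.
- (constant → constant+) A simple constant becomes a more complex one.
- (constant → scalar) A constant is replaced with a variable or an argument.
- (statement → statements) More unconditional statements are added (break, continue, return and the like).
- (unconditional → if) The execution path is split.
- (scalar → array) A variable or argument is replaced with an array.
- (array → container) An array is replaced with a container.
- (statement → recursion) A statement is replaced with recursion.
- (if → while) A condition is replaced with a loop.
- (expression → function) An expression is replaced with a function or algorithm.
- (variable → assignment) The value of a variable is replaced.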
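To make the list more concrete, here is a minimal sketch of how one small function might evolve through three of these transformations. The function CountPluses and its V1 to V3 suffixes are hypothetical names invented for this illustration; they are not part of the interpreter.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example: count the '+' characters in an expression.

// Step 1 (nil -> constant): the simplest code that passes a test for "".
inline std::size_t CountPlusesV1(const std::wstring &) {
    return 0;
}

// Step 2 (unconditional -> if): a test for "+" forces a branch.
inline std::size_t CountPlusesV2(const std::wstring &expr) {
    if(!expr.empty() && expr[0] == L'+') return 1;
    return 0;
}

// Step 3 (if -> while): a test for "1+2+3" turns the single check into a loop.
inline std::size_t CountPlusesV3(const std::wstring &expr) {
    std::size_t count = 0;
    std::size_t pos = 0;
    while(pos < expr.size()) {
        if(expr[pos] == L'+') ++count;
        ++pos;
    }
    return count;
}
```

Each version is only as general as the tests written so far require, which is the point of preferring the simpler transformation first.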
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
Note that std::string is replaced with std::wstring from here on, so that expressions are handled as Unicode strings.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
AssertRange
- , AreEqual
, , , .
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;
For Assert to compile and report values of this type, a ToString function has to be provided for it.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
. (constant → scalar) .
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
Since a token holds either an operator or a number, but never both, the two fields can share a union.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        // The function returns std::wstring, so the literal must be wide.
        default: return L"Unknown token.";
    }
}
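The type checks inside the conversion operators are easy to miss, so here is a small standalone sketch of how they behave. It repeats a condensed copy of the class so that it compiles on its own; the helper ThrowsWhenReadAsNumber is a hypothetical name added only for this illustration.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Condensed copy of the Token class from the listing above.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;   // tells which union member is currently valid
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a double is rejected.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        double value = token; // goes through operator double()
        (void)value;
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

Reading an operator token as a number throws instead of silently reinterpreting the union, which is what makes the implicit conversions reasonably safe to use in tests.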
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
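Rather than parsing the number by hand, we can rely on the C standard library: isdigit to detect a digit and atof to convert the text, or rather their wchar_t counterparts (the code that follows uses iswdigit and _wtof). This is the (expression → function) transformation.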
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
. .
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
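wcstod is used here in place of _wtof because, besides converting the number, it reports through its second argument where the number ends, so scanning can continue from that position.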
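The role of the end pointer is easier to see in isolation. The following is a minimal sketch using only the standard library; ScanNumbers is a hypothetical helper written for this illustration, and it assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in `text` and skips
// all other characters.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current != L'\0') {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod converts the longest valid prefix and stores in `end`
            // the address of the first character it did not consume.
            numbers.push_back(std::wcstod(current, &end));
            current = end; // resume right after the number
        } else {
            ++current;     // not the start of a number, move on
        }
    }
    return numbers;
}
```

Because the conversion is attempted only when the current character is a digit, wcstod always consumes at least one character, so the loop cannot stall.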
. .
.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
(unconditional → if) , .
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}
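Time to refactor. The scanning logic moves into a class in the Detail namespace, and the Tokenize function is reduced to the following.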
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
First, the Operator enumeration is extended with the missing operators and the parentheses.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
Then the IsOperator method of Tokenizer is updated to recognize all of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
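For comparison, the same membership test can be written as a lookup in a string of operator characters. This is an alternative sketch, not the approach taken in the article; IsOperatorChar is a hypothetical free function, and it assumes the operator set is exactly the six characters used by the Operator enumeration.

```cpp
#include <string>

// Hypothetical alternative to Tokenizer::IsOperator: the operator set is
// kept as a plain string and searched with std::wstring::find.
inline bool IsOperatorChar(wchar_t symbol) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(symbol) != std::wstring::npos;
}
```

The trade-off is that the character list here has to be kept in sync with the enumeration by hand, whereas the any_of version refers to the enumerators directly.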
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
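The (constant → scalar) transformation makes it pass: the constant return value becomes a member variable set by the constructor.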
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
Next, a token should give back the operator it was created from.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
A conversion operator is enough for that.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
The same test for a number token.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
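A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the type tag before reading the union.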
class Token
{
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // Each conversion checks the type tag before reading the union.
    operator Operator() const
    {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const
    {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union
    {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token)
{
    switch(token.Type())
    {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        // A wide literal is required here: std::wstring cannot be built from a narrow string.
        default: return L"Unknown token.";
    }
}
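The article's toolchain predates it, but with a C++17 compiler the same "either an operator or a number" idea can be expressed with std::variant, which keeps the type tag and the storage together. This is a sketch under that assumption, not the article's design; VariantToken, IsNumber and IsOperator are hypothetical names, and the Operator enum is cut down to two values for brevity.

```cpp
#include <cassert>
#include <variant>

// Reduced copy of the article's enum, for illustration only.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// Hypothetical alternative to the hand-written tagged union.
using VariantToken = std::variant<Operator, double>;

inline bool IsNumber(const VariantToken &token)
{
    return std::holds_alternative<double>(token);
}

inline bool IsOperator(const VariantToken &token)
{
    return std::holds_alternative<Operator>(token);
}
```

Reading the wrong alternative with std::get throws std::bad_variant_access, which plays the role of the std::logic_error thrown by the conversion operators above.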
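With Token able to hold both kinds of values, the lexer's single-digit test can be revisited. A quick way to make it pass: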
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
That is enough for one digit. The next test uses a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
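The C standard library already has what is needed here: isdigit to detect a digit and atof to parse a number, or rather their wchar_t counterparts. Replacing the hand-written arithmetic with library calls is the (expression → function) transformation.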
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
Next, an operator followed by a number.
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
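The function currently returns as soon as it has found one token. As a preparatory refactoring, the tokens are collected in a local variable result, which is returned at the end.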
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
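Now the transformation is straightforward: (if → while). The loop advances through the string until it reaches the terminating null.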
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
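wcstod replaces _wtof because it also reports, through its second argument, where the parsed number ends. The loop continues scanning from that position.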
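The role of the end pointer is easiest to see in isolation. The following sketch is not part of the article's code: ScanNumbers is a hypothetical helper that pulls every number out of a string, skipping one character whenever std::wcstod reports that it parsed nothing.

```cpp
#include <cassert>
#include <cwchar>
#include <vector>

// Hypothetical helper: collects all numbers found in text.
inline std::vector<double> ScanNumbers(const wchar_t *text)
{
    std::vector<double> numbers;
    const wchar_t *current = text;
    while(*current)
    {
        wchar_t *end = nullptr;
        double value = std::wcstod(current, &end);
        if(end == current)
        {
            // Nothing was parsed at this position; skip one character.
            ++current;
        }
        else
        {
            numbers.push_back(value);
            // Continue right after the last character of the number.
            current = end;
        }
    }
    return numbers;
}
```

Note that std::wcstod also accepts leading whitespace and a sign, so a helper like this treats "-5" as a single negative number. The article's lexer avoids that by calling wcstod only when the current character is a digit.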
Next, spaces between tokens should be skipped.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
The (unconditional → if) transformation adds an explicit check for the operator, and any other character is skipped.
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
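Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.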
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
namespace Detail
{
    class Tokenizer
    {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize()
        {
            while(!EndOfExperssion())
            {
                if(IsNumber())
                {
                    ScanNumber();
                }
                else if(IsOperator())
                {
                    ScanOperator();
                }
                else
                {
                    // Anything else (for example a space) is skipped.
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber()
        {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator()
        {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };
} // namespace Detail
The remaining operators and the parentheses can be covered with one larger test.
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
The Operator enumeration gains the missing values.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
The IsOperator method of Tokenizer is updated to recognize all of them.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
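A more compact way to write the same membership check is to search a string of operator characters. The sketch below is only an alternative for comparison; IsOperatorChar is a hypothetical free function, and it repeats the character list instead of reusing the Operator enum, so the two would have to be kept in sync by hand.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical free-function variant of IsOperator.
inline bool IsOperatorChar(wchar_t c)
{
    // std::wcschr also matches the terminating null of the search string,
    // so the null character has to be excluded explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The version built on std::any_of is longer but draws its values from the enum, which is probably why it fits the article's code better.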
The complete code at this point:
Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter
{
    enum class Operator : wchar_t
    {
        Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
    };

    inline std::wstring ToString(const Operator &op)
    {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type)
    {
        switch(type)
        {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token
    {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const
        {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const
        {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right)
        {
            if(left.m_type == right.m_type)
            {
                switch(left.m_type)
                {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union
        {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token)
    {
        switch(token.Type())
        {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer
    {
        namespace Detail
        {
            class Tokenizer
            {
            public:
                Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

                void Tokenize()
                {
                    while(!EndOfExperssion())
                    {
                        if(IsNumber()) { ScanNumber(); }
                        else if(IsOperator()) { ScanOperator(); }
                        else { MoveNext(); }
                    }
                }

                const Tokens &Result() const { return m_result; }

            private:
                bool EndOfExperssion() const { return *m_current == L'\0'; }

                bool IsNumber() const { return iswdigit(*m_current) != 0; }

                void ScanNumber()
                {
                    wchar_t *end = nullptr;
                    m_result.push_back(wcstod(m_current, &end));
                    m_current = end;
                }

                bool IsOperator() const
                {
                    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
                    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
                }

                void ScanOperator()
                {
                    m_result.push_back(static_cast<Operator>(*m_current));
                    MoveNext();
                }

                void MoveNext() { ++m_current; }

                const wchar_t *m_current;
                Tokens m_result;
            };
        } // namespace Detail

        inline Tokens Tokenize(const std::wstring &expr)
        {
            Detail::Tokenizer tokenizer(expr);
            tokenizer.Tokenize();
            return tokenizer.Result();
        }
    } // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests
{
    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange
    {
        template<class T, class ActualRange>
        static void AreEqual(initializer_list<T> expect, const ActualRange &actual)
        {
            auto actualIter = begin(actual);
            auto expectIter = begin(expect);
            Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
            for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter)
            {
                auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
                Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
            }
        }
    } // namespace AssertRange

    TEST_CLASS(LexerTests)
    {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression)
        {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator)
        {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit)
        {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number)
        {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number)
        {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces)
        {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion)
        {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens);
        }
    };

    TEST_CLASS(TokenTests)
    {
    public:
        TEST_METHOD(Should_get_type_for_operator_token)
        {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token)
        {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token)
        {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token)
        {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };
}
GitHub . , . "__".
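A stub is enough to get the first test to compile and fail: Tokens is a typedef for a std::vector of Token, and Tokenize simply throws.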
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter
{
    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer
    {
        // Stub: exists only so that the first test compiles and fails.
        inline Tokens Tokenize(std::string expr) { throw std::exception(); }
    } // namespace Lexer
} // namespace Interpreter
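To choose the change that makes a failing test pass, this article follows The Transformation Priority Premise (TPP). The premise is that code evolves through a small set of transformations, and that the simpler transformations should be preferred over the more complex ones whenever either would make the test pass.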
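The transformations, from the simplest to the most complex:
- ({} → nil): from no code at all to code that uses nil.
- (nil → constant): replacing nil with a constant.
- (constant → constant+): replacing a simple constant with a more complex one.
- (constant → scalar): replacing a constant with a variable or an argument.
- (statement → statements): adding more unconditional statements (break, continue, return and the like).
- (unconditional → if): splitting the execution path.
- (scalar → array): replacing a variable or argument with an array.
- (array → container): replacing an array with a more complex container.
- (statement → recursion): replacing a statement with recursion.
- (if → while): replacing a condition with a loop.
- (expression → function): replacing an expression with a function or algorithm.
- (variable → assignment): changing the value of a variable.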
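A toy progression may make the ordering more concrete. The three functions below are hypothetical and unrelated to the lexer's real code: each is the version a new test would force, obtained from the previous one by a single transformation.

```cpp
#include <cassert>

// (nil → constant): the first test only ever passes L'1', so a constant is enough.
inline int DigitV1(wchar_t) { return 1; }

// (constant → scalar): a second test with another digit forces the result
// to depend on the argument.
inline int DigitV2(wchar_t c) { return c - L'0'; }

// (unconditional → if): a test with a non-digit splits the execution path.
// Returning -1 for "not a digit" is an arbitrary choice made for this sketch.
inline int DigitV3(wchar_t c)
{
    if(c < L'0' || c > L'9') return -1;
    return c - L'0';
}
```

Each step is the least complex transformation that satisfies the newest test, which is the discipline the TPP asks for.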
The simplest change to the stub that makes the first test pass:
inline Tokens Tokenize(std::string expr) { return{}; }
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
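Let's start with the test for the empty expression. For it to compile we need a minimal lexer interface. The list of tokens is simply a std::vector, aliased as Tokens. The first version of Tokenize throws, so the test compiles and fails.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: compiles, but any test that calls it fails (red).
    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer
    } // namespace Interpreter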
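The test fails, so now it has to be made to pass with the smallest possible change. To choose that change we can rely on The Transformation Priority Premise (TPP), formulated by Robert C. Martin. Refactoring changes the structure of the code without changing its behavior. A transformation is the opposite: it changes the behavior, moving the code from a more specific form to a more general one. The transformations are ordered by complexity, and the premise is that preferring the simpler ones keeps each step small and leads to simpler code. Below, each implementation step names the TPP transformation it uses.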
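The transformations, from the simplest to the most complex:
- ({} → nil) From no code at all to code that returns nothing useful.
- (nil → constant) Replace nil with a constant.
- (constant → constant+) Replace a simple constant with a more complex one.
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements (break, continue, return and the like).
- (unconditional → if) Split the execution path.
- (scalar → array) Replace a variable with an array.
- (array → container) Replace an array with a more complex container.
- (statement → recursion) Replace a statement with recursion.
- (if → while) Replace a condition with a loop.
- (expression → function) Replace an expression with a function or an algorithm.
- (variable → assignment) Change the value of a variable.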
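As a rough illustration of how the transformations drive the code, here is a sketch that is not part of the interpreter. The CountPluses functions are hypothetical names invented for this example. Each version is the smallest change that would satisfy one more test.

```cpp
#include <cassert>
#include <string>

// Step 1, (nil -> constant): the only test so far expects 0 for "".
inline int CountPlusesV1(const std::wstring &) { return 0; }

// Step 2, (unconditional -> if): a test for "+" forces a branch.
inline int CountPlusesV2(const std::wstring &s) {
    if(s.empty()) return 0;
    return 1;
}

// Step 3, (if -> while): a test for "++" generalizes the branch into a loop.
inline int CountPlusesV3(const std::wstring &s) {
    int count = 0;
    const wchar_t *current = s.c_str();
    while(*current) {
        if(*current == L'+') ++count;
        ++current;
    }
    return count;
}
```

Each version is deliberately too specific for the next test. That is the point: the tests, not guesses about the future, push the code toward the general form.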
The simplest change that makes the first test pass is to return an empty list.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }
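Before writing the next test, let's replace std::string with std::wstring, since the input may contain Unicode. The next test tokenizes a single plus operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }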
AssertRange is a small helper namespace of our own. Its AreEqual function compares an expected initializer list with an actual range element by element. It first checks that the sizes match, then reports the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // The sizes must match before the elements are compared.
        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
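Operator is an enumeration whose underlying type is wchar_t, so the value of each operator is the character that denotes it. For now, a token is simply an operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;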
For Assert to print a readable message when a comparison fails, it needs a ToString function for the compared type.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }
Now the test compiles and fails. To make it pass, we apply the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
The next test tokenizes a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
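This test does not compile: a token now has to hold either an operator or a number. There are several ways to do this, such as a hierarchy of token classes distinguished with dynamic_cast, an approach based on std::function, or a generic holder such as Boost.Any. Here we take a simpler route: a single Token class that knows its own type. The lexer test is put aside for a while, and the Token class gets its own tests. The first one checks the type of an operator token.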
    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
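A ToString overload for TokenType is needed as well, so that Assert can report mismatches. The next test covers the number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }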
To make it pass we apply the (constant → scalar) transformation: the constant returned by Type() becomes a member variable.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
Next, a token must give back the operator it was constructed with.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
The implementation stores the operator and exposes it through a conversion operator.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same test for a number.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the stored type and throw on a mismatch.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        // Each conversion is allowed only for a token of the matching type.
        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: the function returns std::wstring.
                return L"Unknown token.";
        }
    }
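The effect of the type check can be shown with a short sketch. It repeats a trimmed-down copy of the Token class so that it compiles on its own with any C++11 compiler. NumberAccessThrows is a hypothetical helper added only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Trimmed-down copy of the article's types, repeated so the sketch is self-contained.
enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: returns true if reading the token as a number is rejected.
inline bool NumberAccessThrows(const Token &token) {
    try {
        double value = static_cast<double>(token);
        (void)value; // only the conversion itself matters here
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

Reading an operator token as a number throws instead of silently reinterpreting the bytes of the union, which is what an unchecked union would do.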
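With the Token class in place, we can return to the lexer test for a single digit. The simplest implementation that makes it pass:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };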
This handles a single digit only, so the next test uses a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
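The C standard library already has what we need: isdigit to recognize a digit and atof to convert a string to a number. Since we work with wchar_t, we use their wide-character counterparts. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        // _wtof is the Microsoft-specific wide-character version of atof.
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }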
The next test has two tokens in one expression: a plus followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
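The test fails because only the first token is returned. As a preparatory step we introduce a result variable that accumulates the tokens. The behavior does not change yet.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }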
Now the transformation itself: (if → while). The condition becomes a loop that runs until the end of the string.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // continue right after the number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
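wcstod replaces _wtof here. Besides the value, wcstod reports through its second argument where the number ended, which is exactly what the loop needs to move the current pointer past the number.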
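The following sketch shows the end-pointer behavior of wcstod in isolation. ScanNumber here is a hypothetical free function written for this illustration, not the method from the article. It assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <utility>

// Hypothetical helper: parses the number at the start of `text` and returns the
// value together with the count of characters that wcstod consumed.
inline std::pair<double, std::size_t> ScanNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    double value = std::wcstod(text, &end);
    // If nothing could be parsed, wcstod leaves `end` equal to `text`.
    return { value, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.34+5" the helper consumes five characters and stops at the plus sign, so a lexer can resume from there. Note that wcstod also accepts a leading sign and leading whitespace. The article's lexer avoids this because it calls wcstod only when the current character is a digit.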
The next test checks that spaces are skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
One more (unconditional → if): a character is treated as an operator only if it is a plus, and anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current; // not a digit and not an operator: skip it
        }
    }
The function has grown, so it is time to refactor. The logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current; // points into the string passed to the constructor
        Tokens m_result;
    };

    } // namespace Detail
The remaining operators and the parentheses are still missing. One test with a complex expression covers them all.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
The Operator enumeration gets the missing values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
The IsOperator method of the Tokenizer class now checks the current character against all of them.

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
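The same check could be written differently. The sketch below is an alternative, not the article's implementation. IsOperatorChar is a hypothetical free function that looks the character up in a string of operator characters with the standard wcschr.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to IsOperator: look the character up in a fixed string.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also matches the terminating null character, so exclude it explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

This version is shorter, but it duplicates the operator characters in a string literal. The article's version derives them from the Operator enumeration, so the list of characters lives in one place.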
This completes the lexer. The full code of the lexer and its tests:
Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    // The value of each operator is the character that denotes it.
    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    // A token is either an operator or a number; m_type says which.
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = {
                Operator::Plus, Operator::Minus, Operator::Mul,
                Operator::Div, Operator::LParen, Operator::RParen
            };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4),
                Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
The code for the article is available on GitHub.
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
The test now compiles and fails. To make it pass, apply the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
The next test covers a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
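A token now has to hold either an operator or a number. Several designs come to mind:
- A Token class hierarchy, using dynamic_cast to get at the concrete type.
- std::function.
- Boost.Any or a similar type-erasing container, which would mean a third-party dependency.
The design developed below is a single Token class that stores a type tag next to its value. It gets its own tests.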
    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
A ToString function for TokenType is needed as well, for the same reason as before. The next test covers number tokens.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
The test fails. The (constant → scalar) transformation fixes it: the constant becomes a field set by the constructor.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
The next test reads the operator back from an operator token.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same is done for the value of a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the type tag and throw if the wrong kind of value is requested.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: a narrow one does not convert to std::wstring.
                return L"Unknown token.";
        }
    }
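The guarded conversion operators are the part of this design that is easiest to get wrong, so here is a small self-contained sketch of how they behave. The Token class is a trimmed copy of the one above; ThrowsWhenReadAsOperator is a hypothetical helper added only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Trimmed copy of the article's Token: a type tag plus a union of payloads.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: reports whether reading the token as an operator throws.
inline bool ThrowsWhenReadAsOperator(const Token &token) {
    try {
        static_cast<void>(static_cast<Operator>(token));
        return false;
    } catch (const std::logic_error &) {
        return true;
    }
}
```

The tag check means that reading the inactive union member is reported as an exception instead of silently returning garbage.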
With Token in place, the lexer test for a single digit can be passed:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };
The next test covers a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
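The C standard library already has what is needed: isdigit to detect a digit and atof to parse a number, each with a counterpart for wchar_t. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) }; // _wtof is Microsoft-specific.
        return{ static_cast<Operator>(*current) };
    }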
The next test combines an operator and a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
Before making it pass, a small refactoring: tokens are collected in a result variable instead of being returned immediately.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }
Now the transformation itself: (if → while). The check for the end of the string becomes the loop condition.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // Continue right after the parsed number.
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
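wcstod replaces _wtof here because, through its second argument, it reports where the parsed number ends, so scanning can continue from that position.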
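The end pointer reported by wcstod is what lets the lexer step over a whole number at once. A minimal sketch, assuming the default "C" locale; ParseNumber is a hypothetical helper introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: parses the number at the start of `text` and reports
// how many characters it occupied. If no number is found, `consumed` is 0.
inline double ParseNumber(const wchar_t *text, std::size_t &consumed) {
    wchar_t *end = nullptr;
    double value = std::wcstod(text, &end);
    consumed = static_cast<std::size_t>(end - text);
    return value;
}
```

For the input L"12.34+5" the helper consumes five characters and stops at the plus sign, which is exactly where the lexer has to resume.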
The next test checks that spaces are skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
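The (unconditional → if) transformation is enough: a character becomes an operator token only if it is a known operator, and anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current; // Not a digit or an operator: skip it.
        }
    }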
Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
The last lexer test is an expression that uses every operator and the parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
The Operator enumeration gets the missing operators and the parentheses.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
The IsOperator method of Tokenizer now checks the current character against all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                     Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }
The full code at this point:

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                         Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {
                return *m_current == static_cast<wchar_t>(o);
            });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
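The list of tokens will be a std::vector, aliased as Tokens. To start, a stub is enough for the first test to compile and fail.

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter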
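To get from red to green in the smallest possible step, the changes below follow Robert Martin's The Transformation Priority Premise (TPP): when several changes would make a failing test pass, prefer the simplest one.
The transformations, from simplest to most complex:
- ({} → nil) from no code at all to code that returns nil.
- (nil → constant) from nil to a constant.
- (constant → constant+) from a simple constant to a more complex one.
- (constant → scalar) replacing a constant with a variable or an argument.
- (statement → statements) adding more unconditional statements (break, continue, return and the like).
- (unconditional → if) splitting the execution path.
- (scalar → array) from a single value to an array.
- (array → container) from an array to a container.
- (statement → recursion) replacing a statement with recursion.
- (if → while) replacing a condition with a loop.
- (expression → function) replacing an expression with a function or algorithm.
- (variable → assignment) replacing the value of a variable.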
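As a rough illustration of how these transformations build on one another, here is a minimal sketch. The function CountDigits and its three versions are hypothetical names chosen for this example; they are not part of the interpreter. Each version is the smallest change that would satisfy one more imagined test.

```cpp
#include <cwctype>
#include <string>

// (nil → constant): enough for a test that passes an empty string.
inline int CountDigitsV1(const std::wstring &) {
    return 0;
}

// (unconditional → if): a test with a single digit forces a branch.
inline int CountDigitsV2(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) {
        return 1;
    }
    return 0;
}

// (if → while): a test with several digits forces a loop.
inline int CountDigitsV3(const std::wstring &text) {
    int count = 0;
    const wchar_t *current = text.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            ++count;
        }
        ++current;
    }
    return count;
}
```

The lexer in this article grows in much the same way: a constant result first, then a branch, then a loop.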
The simplest change that makes the first test pass is to return an empty list.

inline Tokens Tokenize(std::string expr) {
    return{};
}
Before the next test, std::string is replaced with std::wstring so that the lexer can accept Unicode input.

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
AssertRange is a helper namespace of our own. Its AreEqual function compares the expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of any mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare sizes first so that a length mismatch is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
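For readers without the Visual Studio test framework, the same comparison idea can be sketched with the standard library alone. FirstMismatch is a hypothetical helper, not something the article defines; it returns a position instead of raising an assertion failure.

```cpp
#include <initializer_list>
#include <iterator>
#include <vector>

// Returns -1 if the ranges are equal, otherwise the index of the first
// differing element (or the length of the shorter range if one is a
// prefix of the other).
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) {
            return position;
        }
    }
    bool sameLength = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameLength ? -1 : position;
}
```

Like AreEqual, it takes the expected values as a braced list, for example FirstMismatch({ 1, 2, 3 }, values), so the element type is deduced from the list.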
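Operator is an enumeration with wchar_t as its underlying type, so the value of each operator is its own character. For now, Token is just an alias for it.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;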
For Assert to print the compared values when a check fails, the type also needs a ToString function.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
Now the test compiles and fails. To make it pass, apply (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}

The next test is for a number.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
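A token can no longer be just an operator: it has to hold a number as well. The options considered were:
- a Token class hierarchy, with dynamic_cast to recover the concrete type;
- storing tokens as std::function objects;
- a type-erased holder such as Boost.Any.
The choice here is a single Token class that knows its own type. Its tests go into a separate TokenTests class.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};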
A ToString function for TokenType is needed as well, for the same reason as before.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}

Apply (constant → scalar): the type becomes a field set in the constructor.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, the operator itself has to be readable from the token.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}

A conversion operator is enough.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};

The same goes for the number.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
Since a token holds either an operator or a number but never both, the two fields can share storage in a union.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // Each conversion checks the stored type before reading the union.
    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
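To see what the type check in the conversion operators buys, here is a small self-contained sketch. The Token class is a condensed copy of the one above, with Operator reduced to a single value; ThrowsWhenReadAsNumber is a hypothetical helper added only for this illustration.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Condensed copy of the article's Token: a type tag plus a union.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: reports whether reading the token as a number throws.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

Reading the inactive member of the union would otherwise go unnoticed; with the check, a mistaken conversion fails loudly at the point of use.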
Back to the lexer. The single-digit test now passes with:

if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };

The next test is a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
Parsing the number by hand is unnecessary: the C library already has isdigit and atof, along with their counterparts for wchar_t. This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
The next test combines an operator and a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}

As a preparatory step, the tokens are collected in a result variable instead of being returned directly.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the transformation itself: (if → while).

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;   // continue right after the number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}

wcstod replaces _wtof here because, besides the value, it reports where the number ends, so scanning can continue from the next character.
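The end-pointer behaviour of wcstod can be seen in isolation in the following sketch. ScanNumbers is a hypothetical function written for this illustration: it keeps only the numbers and skips every other character, whereas the lexer above also emits operators.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Collects every number that starts at a digit and skips everything else.
inline std::vector<double> ScanNumbers(const std::wstring &expr) {
    std::vector<double> numbers;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod parses as much as it can and stores in 'end'
            // a pointer to the first character it did not consume.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        } else {
            ++current;
        }
    }
    return numbers;
}
```

Because the call is made only when the current character is a digit, wcstod always consumes at least one character, so the loop cannot stall.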
The next test checks that spaces are skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}

Apply (unconditional → if): an operator is added only for a plus sign, and any other character is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}
Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The last lexer test is a complete expression.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}

The remaining operators are added to Operator.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
IsOperator in Tokenizer now checks the current character against all of them.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {
        return *m_current == static_cast<wchar_t>(o);
    });
}

The complete code at this point:
Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code for this article is available on GitHub.
-
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
,std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- ,AreEqual
, , , .
AssertRangenamespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
.wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, ,Assert
,ToString
, .
std::wstring ToString(const Token &)inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
.. …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
,Token
.dynamic_cast
, . , , .std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
…
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, ,union
. .
Tokenclass Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
…..
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
,C
,isdigit
, ,atof
, ,wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . .result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
,_wtof
, . , . , .
..
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. .Detail
.Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizernamespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
.IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
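The project's source code is available on GitHub.

To get the first test to compile, declare a Tokenize function that returns a std::vector of tokens, aliased as Tokens. For now it simply throws, so the test fails (red).

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception(); // deliberate failure: the red step
    }

    } // namespace Lexer
    } // namespace Interpreter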
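There are many ways to turn a red test green, and it is tempting to write the final algorithm at once. Robert Martin's Transformation Priority Premise (TPP) suggests a more disciplined route. Each change to the code is a transformation that makes it slightly more general, and the transformations are ordered from simple to complex. When several of them would make the test pass, prefer the one nearest the top of the list. TPP is a heuristic rather than a strict rule.

The transformations, in order of priority:
- ({} → nil) Go from no code at all to code that returns nothing.
- (nil → constant) Return a constant.
- (constant → constant+) Replace a simple constant with a more complex one (for example, a longer literal).
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements (break, continue, return and the like).
- (unconditional → if) Split the execution path.
- (scalar → array) Replace a variable with an array.
- (array → container) Replace an array with a more complex container.
- (statement → recursion) Replace a statement with recursion.
- (if → while) Replace a condition with a loop.
- (expression → function) Replace an expression with a function or algorithm.
- (variable → assignment) Change the value of a variable.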
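To make the list more concrete, here is a small sketch that is not part of the article's project. It assumes a hypothetical task, counting the decimal digits in a string, and shows three successive versions of the same function, each one produced by a single transformation and each just sufficient for the test that would have prompted it. The names CountDigitsStage1 to CountDigitsStage3 are invented for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical task: count the decimal digits in a string.

// ({} -> nil), then (nil -> constant): enough for an "empty input" test.
inline std::size_t CountDigitsStage1(const std::wstring &) {
    return 0;
}

// (unconditional -> if): enough for a test with a single digit.
inline std::size_t CountDigitsStage2(const std::wstring &text) {
    if (text.empty()) {
        return 0;
    }
    return 1;
}

// (if -> while): the condition becomes a loop and handles any input.
inline std::size_t CountDigitsStage3(const std::wstring &text) {
    std::size_t count = 0;
    for (wchar_t ch : text) {
        if (ch >= L'0' && ch <= L'9') {
            ++count;
        }
    }
    return count;
}
```

In a real TDD session only one function would exist and each stage would overwrite the previous one. They are kept side by side here purely so the steps can be compared.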
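The first transformation, ({} → nil), is enough here: return an empty list.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }

The test passes, and there is nothing to refactor yet.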
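Before moving on, std::string is replaced with std::wstring. The tests pass wide string literals, and this also leaves room for Unicode input. The next test covers a single operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }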
AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        // Compare sizes first so that a length mismatch is reported clearly.
        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
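Operator is an enum class whose underlying type is wchar_t, so the value of each enumerator is the character that denotes it. For now, Token is simply an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };
    typedef Operator Token;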
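For Assert to print a value when a comparison fails, a ToString function must exist for the compared type, so one is added.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }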
Now the test compiles and fails, as it should. To make it pass, apply the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
The next test covers a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
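This test cannot pass while Token is an alias for Operator: a token now has to hold either an operator or a number. There are several ways to model this:
- A Token base class with one derived class per token kind, using dynamic_cast to recover the concrete type.
- An approach built on std::function.
- A type-erasing holder such as Boost.Any.

Here a simpler route is taken: a single Token class that records which kind of value it holds. Because Token becomes a class in its own right, it gets its own tests, starting with the token type.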
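For comparison, the class-hierarchy option could look roughly like the sketch below. It is not the design the article adopts, and the names TokenBase, NumberToken, OperatorToken and TryGetNumber are hypothetical.

```cpp
// Sketch of the inheritance-based alternative (names are hypothetical).
struct TokenBase {
    virtual ~TokenBase() = default; // polymorphic base, required for dynamic_cast
};

struct NumberToken : TokenBase {
    explicit NumberToken(double v) : value(v) {}
    double value;
};

struct OperatorToken : TokenBase {
    explicit OperatorToken(wchar_t s) : symbol(s) {}
    wchar_t symbol;
};

// Stores the number in 'out' and returns true if the token is a NumberToken;
// otherwise leaves 'out' untouched and returns false.
inline bool TryGetNumber(const TokenBase &token, double &out) {
    if (auto number = dynamic_cast<const NumberToken *>(&token)) {
        out = number->value;
        return true;
    }
    return false;
}
```

With this design a token list would normally hold pointers to TokenBase, so every token costs an allocation and every access a run-time type check. That overhead is one reason a single value-type Token class is attractive for a lexer this small.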
    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };

As with Token, Assert needs a ToString function for TokenType; it appears in the full listing. The next test asks for the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
The (constant → scalar) transformation makes it pass: the hard-coded constant becomes a field set by the constructor.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
Next, a token should give back the operator it holds.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

A conversion operator does the job.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };

The same is needed for numbers.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the stored type and throw if the token is read as the wrong kind.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    // Relies on the ToString(const Operator &) overload from the full listing.
    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }
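The sketch below shows how the guarded conversions behave from the caller's side. It repeats a condensed copy of Token so that it compiles on its own; the helper RejectsOperatorAccess is hypothetical and exists only for this illustration.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Condensed copy of the article's Token, repeated so the sketch is self-contained.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true if reading the token as an operator is rejected.
inline bool RejectsOperatorAccess(const Token &token) {
    try {
        (void)static_cast<Operator>(token);
        return false;
    } catch (const std::logic_error &) {
        return true;
    }
}
```

The type tag turns what would be an invalid read of the union into an exception. The price is one comparison per access, which seems a reasonable trade for a structure that is read far more often than it is created.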
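Token is complete, so it is time to return to the lexer and the single-digit test. A straightforward implementation:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };

The test passes. The next one covers a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }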
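The C standard library already provides isdigit to recognise a digit and atof to parse a number, and both have counterparts for wchar_t. Using them is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }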
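The test passes. The next one combines an operator with a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

The function now has to return more than one token, so the early returns no longer fit. As a preparatory step, the tokens are collected in a local variable, result, which is returned at the end.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        } else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }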
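The next transformation is (if → while): the check for the end of the string becomes a loop that scans until the string ends.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // continue right after the parsed number
            } else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }

wcstod replaces _wtof because, besides the value, it reports through its second argument where the number ended. Scanning resumes from that position, and there is no need to work out the length of the number by hand.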
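The end-pointer behaviour of wcstod can be seen in isolation in the sketch below. ExtractNumbers is a hypothetical helper, not part of the article's lexer: it collects every number in a string and skips everything else. The sketch assumes the default "C" locale, in which the decimal separator is a dot.

```cpp
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: collects all numbers found in the text.
inline std::vector<double> ExtractNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while (*current != L'\0') {
        if (*current >= L'0' && *current <= L'9') {
            wchar_t *end = nullptr;
            // wcstod parses as many characters as form a valid number
            // and stores a pointer to the first unparsed character in 'end'.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        } else {
            ++current; // not the start of a number: skip one character
        }
    }
    return numbers;
}
```

Because the loop only calls wcstod when the current character is a digit, at least one character is always consumed, so the scan cannot get stuck on the same position.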
Next, the lexer should skip spaces.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

The (unconditional → if) transformation adds a check that the current character really is an operator; anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        } else {
            ++current; // neither a number nor an operator: skip it
        }
    }
Tokenize has grown noticeably, so it is time to refactor. The logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
The lexer is nearly done. The last test feeds it an expression that uses every operator as well as parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }

The Operator enum gains the missing values, which also makes the test compile.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
The IsOperator method of Tokenizer is then updated to recognise all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                     Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
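The same check can be written in other ways. The sketch below is an assumption-laden alternative, not the article's code: a hypothetical free function, IsOperatorChar, that looks the character up in a string of operator symbols.

```cpp
#include <string>

// Hypothetical alternative to the any_of-based check.
inline bool IsOperatorChar(wchar_t ch) {
    // The symbols mirror the values of the Operator enum.
    static const std::wstring operators = L"+-*/()";
    // The terminating L'\0' is not part of the string's contents,
    // so it is correctly reported as "not an operator".
    return operators.find(ch) != std::wstring::npos;
}
```

This version is shorter, but it duplicates the operator characters outside the enum, so the two lists have to be kept in sync by hand. The article's version derives the characters from the enumerators themselves, at the cost of repeating the enumerator names.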
All tests pass. The complete code at this stage:

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <cwctype>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        // Tokens are equal when both the type and the stored value match.
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                         Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter

InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
The next test combines an operator and a number.

    TEST_METHOD(Should_tokenize_plus_and_number)
    {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

Before making it pass, a small refactoring: tokens are collected in a result variable instead of being returned directly.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }
Now the transformation itself: (if → while). The check for the end of the string becomes the loop condition, and each branch advances the current position.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
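wcstod replaces _wtof here because, through its second argument, it also reports where the parsed number ends. Scanning then continues from that position.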
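The way the end pointer drives the scan can be seen in isolation. The sketch below assumes the default "C" locale (so the decimal separator is a dot); scanNumbers is a hypothetical helper written for this illustration and is not part of the article's lexer.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <vector>

// Collects every number found in `text`, skipping all other characters.
// Hypothetical helper for illustration only.
inline std::vector<double> scanNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    const wchar_t *current = text;
    while (*current) {
        if (std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod parses as many characters as form a number and
            // stores the position just past them in `end`.
            numbers.push_back(std::wcstod(current, &end));
            current = end;   // resume right after the number
        } else {
            ++current;       // not a digit: skip one character
        }
    }
    return numbers;
}
```

Because the call is made only when the current character is a digit, wcstod always consumes at least one character, so the loop cannot stall.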
The next test: whitespace must be skipped.

    TEST_METHOD(Should_skip_spaces)
    {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

With (unconditional → if), a character becomes an operator token only when it matches a known operator. Anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }
Time to refactor. The scanning logic moves into a class in the Detail namespace, and Tokenize delegates to it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }
Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
One more test covers a complete expression with every operator and both parentheses.

    TEST_METHOD(Should_tokenize_complex_expression)
    {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }

The Operator enumeration gains the missing values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
IsOperator in Tokenizer must now recognize all of them.

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
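The same membership check can be exercised as a free function, without the surrounding class. This is a sketch under assumptions: Op mirrors the article's Operator enumeration, and isOperatorChar is a hypothetical name introduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Mirrors the article's Operator enum; each value is the operator's character.
enum class Op : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/',
    LParen = L'(', RParen = L')'
};

// Hypothetical free-function version of the IsOperator check.
inline bool isOperatorChar(wchar_t ch) {
    auto all = { Op::Plus, Op::Minus, Op::Mul, Op::Div, Op::LParen, Op::RParen };
    // True when `ch` equals the character of at least one known operator.
    return std::any_of(all.begin(), all.end(),
        [ch](Op o) { return ch == static_cast<wchar_t>(o); });
}
```

Because the enumerators carry their own characters, adding an operator means adding one enumerator and listing it here. No separate character table has to be kept in sync.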
All tests pass. The complete code at this point:

Interpreter.h

    #pragma once
    #include <vector>
    #include <string>
    #include <stdexcept>
    #include <algorithm>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        // Tokens are equal when both the type and the stored value match.
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = {
                Operator::Plus, Operator::Minus, Operator::Mul,
                Operator::Div, Operator::LParen, Operator::RParen
            };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"
    #include <string>

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    // Compares an expected list with an actual range, element by element.
    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual);
              ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position "
                + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
                Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
                Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
GitHub . , . "__".
. , .
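The token list is a std::vector, aliased as Tokens. The first version of the header does just enough for the test to compile and fail.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer
    } // namespace Interpreter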
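How small should each change be? To choose the next step, this walkthrough relies on The Transformation Priority Premise (TPP): when a failing test can be made to pass in several ways, prefer the simplest available transformation of the code.

The transformations, from simplest to most complex:
- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): replace nil with a constant.
- (constant → constant+): replace a simple constant with a more complex one.
- (constant → scalar): replace a constant with a variable or an argument.
- (statement → statements): add more unconditional statements (break, continue, return and so on).
- (unconditional → if): split the execution path.
- (scalar → array): replace a variable with an array.
- (array → container): replace an array with a container.
- (statement → recursion): replace a statement with recursion.
- (if → while): replace a condition with a loop.
- (expression → function): replace an expression with a function or algorithm.
- (variable → assignment): change the value of a variable.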
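To make the ordering concrete, here is a small sketch of three successive versions of one function, each produced by a single transformation from the list. The function and its names (countDigitsV1 through countDigitsV3) are hypothetical and unrelated to the lexer. They show the shape of the steps, not the article's own code.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Step 1, (nil -> constant): the first test expects 0 for an empty string,
// so a constant is enough.
inline int countDigitsV1(const std::wstring &) {
    return 0;
}

// Step 2, (unconditional -> if): a test with a single digit forces a branch.
inline int countDigitsV2(const std::wstring &s) {
    if (s.empty()) return 0;
    return 1;
}

// Step 3, (if -> while): a test with several characters forces a loop.
inline int countDigitsV3(const std::wstring &s) {
    int count = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        if (std::iswdigit(s[i])) ++count;
        ++i;
    }
    return count;
}
```

Each version passes its own test and every earlier one, and each step uses the lowest-ranked transformation that turns the newest failing test green.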
The first transformation is enough to pass the empty-expression test.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }

Along the way, std::string gives way to std::wstring so that the lexer works with Unicode text. The next test:

    TEST_METHOD(Should_tokenize_single_plus_operator)
    {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }
AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of any mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual);
              ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position "
                + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
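The comparison logic does not depend on the Visual Studio framework and can be sketched portably. The version below is an illustration under assumptions: firstMismatch is a hypothetical name, and it returns an index rather than raising a test failure.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Returns the index of the first mismatch, or -1 when the ranges are equal.
// If one range is a prefix of the other, the mismatch is reported at the
// length of the shorter range. Hypothetical helper for illustration only.
template <class T, class ActualRange>
long firstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto e = expect.begin();
    auto a = std::begin(actual);
    long index = 0;
    for (; e != expect.end() && a != std::end(actual); ++e, ++a, ++index) {
        if (!(*e == *a)) return index;
    }
    // One range ended before the other: sizes differ.
    if (e != expect.end() || a != std::end(actual)) return index;
    return -1;
}
```

Taking the expected values as an initializer_list keeps the call site short, for example firstMismatch({1, 2, 3}, values), which is the same convenience the test helper offers.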
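Operator is an enumeration whose underlying type is wchar_t, so each value is simply the operator's character. For now, Token is just an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;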
For Assert to describe the compared values in a failure message, a ToString function must be available for their type.

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }

To make the test pass, apply the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
The next test: a single digit.

    TEST_METHOD(Should_tokenize_single_digit)
    {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
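This test raises a design question: a token must now hold either an operator or a number. Several representations come to mind, among them a Token class hierarchy queried with dynamic_cast, std::function, and Boost.Any. The approach taken here is a single Token class that knows its own type, developed test-first like everything else.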
The first Token test asks an operator token for its type.

    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token)
        {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
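Assert again needs a ToString overload, this time for TokenType. The next test covers number tokens.

    TEST_METHOD(Should_get_type_for_number_token)
    {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

The (constant → scalar) transformation makes it pass: the token type becomes a member variable set by each constructor.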
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
GitHub . , . "__".
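Tokenize returns a list of tokens. The list is a std::vector, aliased as Tokens. The first version of Tokenize only throws, so the test fails.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter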
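When choosing how to make a test pass, it helps to follow the Transformation Priority Premise (TPP): prefer the simplest transformation of the code that turns the current test green, and move to a more complex one only when a simpler one is not enough.
The transformations, from simplest to most complex:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) Replace nil with a constant.
- (constant → constant+) Replace a simple constant with a more complex one.
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements (break, continue, return).
- (unconditional → if) Split the execution path.
- (scalar → array) Replace a single value with an array.
- (array → container) Replace an array with a container.
- (statement → recursion) Replace a statement with recursion.
- (if → while) Replace a condition with a loop.
- (expression → function) Replace an expression with a function or algorithm.
- (variable → assignment) Replace the value of a variable.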
To make the test pass, Tokenize returns an empty list.
inline Tokens Tokenize(std::string expr) {
    return{};
}
The parameter type std::string is then replaced with std::wstring, with Unicode in mind.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of a mismatch.
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
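The same comparison idea can be sketched in standard C++, without the Visual Studio test framework. The helper below is hypothetical (FirstMismatch is not part of the article's code) and returns a position instead of asserting, which makes it easy to try outside a test project. Unlike AssertRange::AreEqual, it does not compare sizes up front; a size difference shows up as a mismatch at the index where the shorter sequence ends.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first position where the
// two sequences differ, or -1 if they have equal length and equal elements.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    // One sequence ended before the other, so the sizes differ.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) return index;
    return -1;
}
```

With a std::vector<int> holding 1, 2, 3, calling FirstMismatch({1, 9, 3}, v) would point at index 1, while FirstMismatch({1, 2}, v) would point at index 2, where the expected list ends.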
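Operator is an enum class with wchar_t as its underlying type, so the value of each operator is its character. For now, Token is simply an alias for Operator.
enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;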
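For the test to compile, Assert needs a ToString function for the compared type, which it uses to print values in failure messages.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}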
Both tests must now pass, which calls for splitting the execution path: (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test covers a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
Token must now represent either an operator or a number. Candidate designs include a class hierarchy rooted at Token and queried with dynamic_cast, std::function, and Boost.Any. Here, Token becomes a single class that reports its type, and it gets its own tests.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
TokenType also needs a ToString overload; it appears in the full listing.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The constant return value is replaced with a member variable: (constant → scalar).
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
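The next test checks that the operator can be read back from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The token stores the operator it was constructed with and exposes it through a conversion operator.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};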
The same is done for the number value.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two values can share storage in a union. Each conversion checks the type first and throws if the token is of the other kind.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // Wide literal: the function returns std::wstring.
    }
}
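To see the checked conversions in action outside the test framework, here is a minimal standalone sketch. The Operator, TokenType and Token definitions are reduced copies of the ones above; NumberAccessThrows is a hypothetical helper added only for this demonstration.

```cpp
#include <stdexcept>

// Reduced stand-ins for the article's types.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // Each conversion checks the tag first, so a request for the inactive
    // union member is reported as an error instead of being attempted.
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true if converting the token to double throws.
inline bool NumberAccessThrows(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
    }
    catch(const std::logic_error &) {
        return true;
    }
    return false;
}
```

NumberAccessThrows(Token(Operator::Plus)) is expected to be true, while NumberAccessThrows(Token(1.5)) is expected to be false. The tag check is what keeps the union safe to use: the code never reads a member that was not the one written.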
With Token in place, back to the lexer. A single digit can be recognized and converted directly:
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
The C standard library already has what is needed: isdigit to recognize a digit and atof to convert a number, or rather their wchar_t counterparts (_wtof is Microsoft-specific). This is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
The next test combines an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
The function must now return more than one token, so the tokens are first accumulated in a local variable, result.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Next transformation: (if → while). The loop runs until the end of the string.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
wcstod replaces _wtof here because it also reports where the number ends, so the scanner can continue from that position.
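The role of the end pointer is easy to check in isolation. The sketch below uses only the standard std::wcstod; ParsedLength is a hypothetical helper introduced for illustration, and the default "C" locale (with '.' as the decimal separator) is assumed.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper: parses the number at the start of `text`, stores it
// in `value`, and returns how many characters std::wcstod consumed.
// A result of 0 means no number was found at that position.
inline std::size_t ParsedLength(const wchar_t *text, double &value) {
    wchar_t *end = nullptr;
    value = std::wcstod(text, &end);
    return static_cast<std::size_t>(end - text);
}
```

For the input L"12.5+3" the helper should report 4 consumed characters and leave 12.5 in value, so a scanner can resume at the '+'. For L"+" alone nothing is consumed, which is why the lexer checks iswdigit before calling wcstod.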
The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
An (unconditional → if) transformation is enough: any character that is neither a digit nor the plus operator is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
One more test covers a complete expression with all the operators and parentheses.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enum gains the remaining operators.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
IsOperator in Tokenizer now checks the current character against all of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
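The membership check can also be tried on its own. This is a minimal sketch under the assumption that the check is a free function taking the character as a parameter (IsOperatorChar is a hypothetical name), rather than a member reading m_current.

```cpp
#include <algorithm>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function variant of Tokenizer::IsOperator: true if `ch`
// equals the character value of any known operator.
inline bool IsOperatorChar(wchar_t ch) {
    const std::initializer_list<Operator> all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [ch](Operator o) { return ch == static_cast<wchar_t>(o); });
}
```

Because each enumerator's value is its character, the comparison needs no lookup table. The cost is that a new operator has to be added both to the enum and to this list.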
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
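The assertion also needs a ToString overload for TokenType; it is trivial and appears in the full listing at the end. The next test covers a number token.
    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
The test fails, because Type() always returns the same value. Replacing the constant with a member variable, the (constant → scalar) transformation, fixes it.
    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };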
A token must also give back the value it was built from. The next test reads the operator from an operator token.
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
A stored operator and a conversion operator are enough.
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same test for a number token:
    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two fields can share storage in a union. Each conversion operator now checks the type tag and throws if the token holds the other kind of value. The Token class and its ToString function at this point:
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}
        TokenType Type() const { return m_type; }
        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }
    private:
        TokenType m_type;
        // Only the member that matches m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };
    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: the function returns std::wstring.
                return L"Unknown token.";
        }
    }
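The checked conversions are easiest to appreciate in use. The sketch below repeats the Token class in standard C++ and adds a hypothetical helper, ThrowsOnNumberAccess, which exists only for this illustration and is not part of the article's code.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: true if reading the token as a number throws.
inline bool ThrowsOnNumberAccess(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

// Usage example; call it from main() or from a test.
inline void TokenExample() {
    Token plus(Operator::Plus);
    Token number(1.5);
    assert(plus.Type() == TokenType::Operator);
    assert(static_cast<Operator>(plus) == Operator::Plus);
    assert(static_cast<double>(number) == 1.5);
    assert(ThrowsOnNumberAccess(plus));    // wrong kind of access is caught
    assert(!ThrowsOnNumberAccess(number));
}
```

Without the tag check, reading the inactive union member would silently return garbage; with it, the mistake surfaces as an exception at the point of misuse.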
With Token in place, the lexer test for a single digit compiles. One more condition in Tokenize makes it pass:
    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };
The next test is a floating-point number.
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
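There is no need to parse the number by hand. The C standard library already has isdigit for classifying a character and atof for converting a string to a number, and both have wchar_t counterparts: iswdigit and _wtof (the latter is specific to the Microsoft C runtime). Replacing hand-written expressions with library calls is the (expression → function) transformation.
    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }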
The next test combines an operator and a number.
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
Tokenize now has to return more than one token, so the early returns get in the way. A preparatory refactoring collects the tokens in a local variable, result, and returns it at the end.
    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }
Now the transformation itself: (if → while). The loop walks the string and adds one token per iteration.
    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
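wcstod replaces _wtof because, besides converting the number, it reports through its second argument where the number ends. That is exactly the position from which the loop has to continue.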
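The role of the end pointer is easier to see in a function that does nothing but collect numbers. This is a sketch in standard C++; ScanNumbers is a hypothetical name used only here, and the code assumes the default "C" locale, where the decimal separator is a period.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <vector>

// Collects every number found in a wide string, skipping other characters.
// std::wcstod stores in `end` a pointer to the first character after the
// number, so the scan can resume from there.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    const wchar_t *current = text;
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(current, &end));
            current = end; // continue right after the parsed number
        } else {
            ++current;     // not the start of a number: skip one character
        }
    }
    return numbers;
}

// Usage example; call it from main() or from a test.
inline void ScanNumbersExample() {
    std::vector<double> numbers = ScanNumbers(L"1 + 12.5");
    assert(numbers.size() == 2);
    assert(numbers[0] == 1.0);
    assert(numbers[1] == 12.5);
}
```

Because the call is made only when the current character is a digit, wcstod always consumes at least one character, so the loop cannot get stuck.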
All tests pass. The next test checks that spaces are skipped.
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
Another (unconditional → if) does it: a character becomes an operator token only if it is a known operator, and anything else is skipped.
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }
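Tokenize has grown enough to deserve a refactoring. The scanning logic moves into a Tokenizer class inside a nested Detail namespace, and Tokenize becomes a thin wrapper around it.
    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }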
Detail::Tokenizer
    namespace Detail {
    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}
        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }
        const Tokens &Result() const { return m_result; }
    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }
        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }
        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }
        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }
        void MoveNext() { ++m_current; }
        const wchar_t *m_current;
        Tokens m_result;
    };
    } // namespace Detail
The code is now easier to read and to extend. The last lexer test covers a complete expression with all the operators and parentheses.
    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
The Operator enumeration gets the remaining operators.
    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
The test still fails, because IsOperator in Tokenizer recognizes only the plus sign. It now has to check all the operators.
    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
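The same check can be written differently. The sketch below is an alternative, not the article's code: it keeps the operator characters in one string and searches it. IsOperatorChar is a hypothetical free function introduced for illustration.

```cpp
#include <cassert>
#include <string>

// Alternative sketch (not the article's code): keep all operator
// characters in one string and search it.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}

// Usage example; call it from main() or from a test.
inline void IsOperatorCharExample() {
    assert(IsOperatorChar(L'*'));
    assert(IsOperatorChar(L'('));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
    assert(!IsOperatorChar(L'\0')); // the terminator is not an operator
}
```

The trade-off is duplication: the string repeats the characters already listed in the Operator enumeration, whereas the any_of version takes them from the enumerators themselves.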
All tests pass, and the lexer is finished. The complete code at this point:
Interpreter.h
    #pragma once
    #include <algorithm>
    #include <initializer_list>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>
    namespace Interpreter {
    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }
    enum class TokenType {
        Operator,
        Number
    };
    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}
        TokenType Type() const { return m_type; }
        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }
        // Tokens are equal when both the type and the stored value match.
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }
    private:
        TokenType m_type;
        // Only the member that matches m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };
    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }
    typedef std::vector<Token> Tokens;
    namespace Lexer {
    namespace Detail {
    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}
        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }
        const Tokens &Result() const { return m_result; }
    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }
        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }
        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }
        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }
        void MoveNext() { ++m_current; }
        const wchar_t *m_current;
        Tokens m_result;
    };
    } // namespace Detail
    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }
    } // namespace Lexer
    } // namespace Interpreter
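To try the lexer outside Visual Studio, a condensed version can be checked with plain assert. This is a sketch under stated simplifications, not the article's design: the Tokenizer class is flattened into a single function, Token is a plain struct with a flag instead of the tagged union, and the operator check searches a string. The code assumes the default "C" locale.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')'
};

// Simplified token: a flag plus both fields, instead of the tagged union.
// For a number token, `op` is an unused placeholder.
struct Token {
    bool isNumber;
    double number;
    Operator op;
};

inline bool IsOperator(wchar_t c) {
    return std::wstring(L"+-*/()").find(c) != std::wstring::npos;
}

inline std::vector<Token> Tokenize(const std::wstring &expr) {
    std::vector<Token> result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            double value = std::wcstod(current, &end);
            result.push_back(Token{ true, value, Operator::Plus });
            current = end;
        } else if(IsOperator(*current)) {
            result.push_back(Token{ false, 0.0, static_cast<Operator>(*current) });
            ++current;
        } else {
            ++current; // anything else, including spaces, is skipped
        }
    }
    return result;
}

// Usage example; call it from main() or from a test.
inline void TokenizeExample() {
    std::vector<Token> tokens = Tokenize(L" 1 + 12.5 ");
    assert(tokens.size() == 3);
    assert(tokens[0].isNumber && tokens[0].number == 1.0);
    assert(!tokens[1].isNumber && tokens[1].op == Operator::Plus);
    assert(tokens[2].isNumber && tokens[2].number == 12.5);
}
```

The struct trades a little memory for brevity; the union-based Token in the listing is the better fit once the parser starts relying on checked access.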
InterpreterTests.cpp
    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"
    namespace InterpreterTests {
    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;
    namespace AssertRange {
    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }
    } // namespace AssertRange
    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }
        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }
        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }
        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }
        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }
        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }
        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }
        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }
        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };
    }
The source code for this article is available on GitHub.
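The test does not compile yet, since neither Tokens nor Tokenize exists. The minimum that compiles is an empty Token, Tokens as a typedef for std::vector, and a Tokenize function that throws, so that the test fails.
    #pragma once
    #include <exception>
    #include <string>
    #include <vector>
    namespace Interpreter {
    struct Token {};
    typedef std::vector<Token> Tokens;
    namespace Lexer {
    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }
    } // namespace Lexer
    } // namespace Interpreter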
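When a test fails, the question is how to change the code so that it passes. Robert Martin's The Transformation Priority Premise (TPP) offers guidance. Where a refactoring changes the structure of code without changing its behavior, a transformation changes behavior, and each one makes the code slightly more general. The transformations are ordered from simple to complex, and the premise is that preferring the simpler ones leads to simpler code. From here on, the transformation used at each step is named in parentheses.
The transformations, in order of priority:
- ({} → nil): from no code at all to code that returns nothing.
- (nil → constant): from nothing to a constant.
- (constant → constant+): from a simple constant to a more complex one.
- (constant → scalar): from a constant to a variable or an argument.
- (statement → statements): adding more unconditional statements (break, continue, return and the like).
- (unconditional → if): splitting the execution path.
- (scalar → array): from a single value to an array.
- (array → container): from an array to a container.
- (statement → recursion): from a statement to recursion.
- (if → while): from a condition to a loop.
- (expression → function): replacing an expression with a function or an algorithm.
- (variable → assignment): replacing the value of a variable.
For the first lexer test the simplest change is enough: Tokenize stops throwing and returns an empty list.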
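A short progression shows how successive tests pull in successive transformations. The sketch uses a hypothetical CountDigits function, unrelated to the interpreter; each namespace holds the code as it would stand after one more test has been made to pass.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Test 1: an empty string has no digits.
namespace Step1 { // (nil -> constant)
inline int CountDigits(const std::wstring &) { return 0; }
}

// Test 2: "7" has one digit.
namespace Step2 { // (unconditional -> if)
inline int CountDigits(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}
}

// Test 3: "a1b22" has three digits.
namespace Step3 { // (if -> while)
inline int CountDigits(const std::wstring &text) {
    int count = 0;
    const wchar_t *current = text.c_str();
    while(*current) {
        if(std::iswdigit(*current)) ++count;
        ++current;
    }
    return count;
}
}

// Usage example; call it from main() or from a test.
inline void TransformationExample() {
    assert(Step1::CountDigits(L"") == 0);
    assert(Step2::CountDigits(L"") == 0);
    assert(Step2::CountDigits(L"7") == 1);
    assert(Step3::CountDigits(L"") == 0);
    assert(Step3::CountDigits(L"7") == 1);
    assert(Step3::CountDigits(L"a1b22") == 3);
}
```

Each step keeps every earlier test passing while handling one more case, and the code becomes more general only as far as the newest test demands.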
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
The complete code at this point.
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

// The value of each enumerator is the character that denotes it.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

// A token is either an operator or a number; m_type says which.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext(); // anything else, such as a space, is skipped
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end; // continue right after the number
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"
#include <initializer_list>
#include <iterator>
#include <string>

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

// Compares the expected list with the actual range element by element.
template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
GitHub . , . "__".
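For the first test to compile, Tokens is defined as a std::vector of Token, and Tokenize is added as a stub. The stub throws, so the test fails.
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter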
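To decide which change to make to the code next, one can follow The Transformation Priority Premise (TPP). It orders code transformations from simple to complex and says that, when a failing test can be passed in several ways, the simpler transformation should be preferred.
The transformations, from highest priority to lowest:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) From nil to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) From a constant to a variable or an argument.
- (statement → statements) Adding more unconditional statements (break, continue, return and the like).
- (unconditional → if) Splitting the execution path.
- (scalar → array) From a variable or argument to an array.
- (array → container) From an array to a more complex container.
- (statement → recursion) From a statement to recursion.
- (if → while) From a condition to a loop.
- (expression → function) Replacing an expression with a function or algorithm.
- (variable → assignment) Replacing the value of a variable.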
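As a rough illustration of how the transformations build on each other, the sketch below shows three successive versions of a hypothetical CountDigits function, each obtained from the previous one by roughly one transformation. The function is not part of the interpreter; its name and the tests implied in the comments are assumptions made for this example.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Hypothetical example, not part of the interpreter.

// Step 1, (nil -> constant): the first test only asks for 0 on an empty string.
inline int CountDigitsV1(const std::wstring &) { return 0; }

// Step 2, (unconditional -> if): a test with one digit splits the execution path.
inline int CountDigitsV2(const std::wstring &text)
{
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}

// Step 3, (if -> while): a test with several digits turns the branch into a loop.
inline int CountDigitsV3(const std::wstring &text)
{
    int count = 0;
    const wchar_t *current = text.c_str();
    while(*current)
    {
        if(std::iswdigit(*current)) ++count;
        ++current;
    }
    return count;
}
```

Each version does just enough for the tests written so far, which is the same rhythm the lexer follows in this article.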
The simplest change that makes the test pass is to return an empty list.
inline Tokens Tokenize(std::string expr) {
    return{};
}
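Before the next test, the parameter type changes from std::string to std::wstring, so that the lexer can accept Unicode input. The next test covers a single operator.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}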
AssertRange is a small helper namespace. Its AreEqual function takes the expected values as an initializer list, checks that the sizes match, and then compares the ranges element by element, reporting the position of a mismatch.
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
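Outside the Visual Studio test framework, the same idea can be sketched with the standard library alone. The helper below, FirstMismatch, is a hypothetical name introduced for this example: it returns the index of the first position where the ranges differ, or -1 when they are equal, and treats a difference in length as a mismatch at the end of the shorter range.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: index of the first differing position, or -1 if equal.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual)
{
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long index = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index)
    {
        // Convert the actual element to T first, as Assert::AreEqual<T> would.
        if(!(static_cast<T>(*actualIter) == *expectIter)) return index;
    }
    // One range ended before the other: the sizes differ.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) return index;
    return -1;
}
```

Returning a position instead of asserting keeps the sketch independent of any test framework; a real helper would more likely fail the test with a message, as AssertRange does.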
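Operator is an enumeration whose underlying type is wchar_t, so the value of each enumerator is the character that denotes the operator. For now, Token is simply an alias for Operator.
enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;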
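For Assert to print values of a user-defined type in its failure messages, a ToString function has to be provided for that type.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}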
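The round trip between an operator and its character can be sketched in portable C++ as follows. OperatorFromChar is a hypothetical helper that does not appear in the article's code; the enumeration is repeated so that the snippet compiles on its own.

```cpp
#include <cassert>
#include <string>

// Repeated from the text so that the sketch is self-contained.
enum class Operator : wchar_t { Plus = L'+' };

// The enumerator's value is the character itself, so a cast is enough.
inline std::wstring ToString(const Operator &op)
{
    return std::wstring(1, static_cast<wchar_t>(op));
}

// Hypothetical helper: the reverse cast, the way the lexer performs it.
inline Operator OperatorFromChar(wchar_t symbol)
{
    return static_cast<Operator>(symbol);
}
```

Note that the reverse cast does not validate its input: any character becomes some Operator value, so the caller has to check that the character really is an operator.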
The test compiles now but fails. To make it pass, the execution path is split (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test feeds the lexer a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
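A number and an operator now have to fit into one token type. Among the options are a Token class hierarchy queried with dynamic_cast, tokens that hold a std::function, and a type-erasing holder such as Boost.Any. The design chosen here is simpler: a token stores its type together with its value. Token gets its own tests.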
The first of them asks an operator token for its type.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
As before, Assert needs a ToString function for TokenType. The next test asks a number token for its type.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The two constructors must now yield different types, so the constant is replaced with a member variable (constant → scalar).
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, the operator stored in a token has to be retrievable.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The token stores the operator and exposes it through a conversion operator.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same goes for numbers.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the token type first and throw if it does not match.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
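The effect of the checked conversions can be tried without the test framework. The sketch below repeats a trimmed copy of Token and adds a hypothetical helper, ThrowsWhenReadAsOperator, which is not part of the article's code and only reports whether reading a token as an operator raises std::logic_error.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Trimmed copy of the Token class from the text.
class Token
{
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const
    {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const
    {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper for the demonstration only.
inline bool ThrowsWhenReadAsOperator(const Token &token)
{
    try
    {
        (void)static_cast<Operator>(token); // runs the checked conversion
        return false;
    }
    catch(const std::logic_error &)
    {
        return true;
    }
}
```

The type tag is what makes the union safe to use here: only the member that was written by the constructor is ever read.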
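With Token in place, it is time to return to the lexer. The single-digit test passes after the following change in Tokenize:
if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };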
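The next test is a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
The C library already has what is needed: isdigit to recognize a digit and atof to convert text to a number, or rather their wchar_t counterparts, iswdigit and the Microsoft-specific _wtof. This is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}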
The next test is an operator followed by a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
Returning a single token from each branch no longer works. As a preparatory step, the tokens are collected in a result variable; the new test still fails.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the transformation itself: (if → while).
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
wcstod replaces _wtof because, besides converting the number, it reports where the number ends. The loop continues from that position.
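The end pointer is what lets a loop resume right after a number. Below is a minimal portable sketch with a hypothetical ScanNumbers function, which collects only the numbers and skips every other character; it is an illustration and not part of the article's lexer.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects all numbers found in the text.
inline std::vector<double> ScanNumbers(const std::wstring &text)
{
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current)
    {
        if(std::iswdigit(*current))
        {
            wchar_t *end = nullptr;
            // wcstod converts the longest valid prefix and stores in `end`
            // a pointer to the first character after it.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        }
        else
        {
            ++current; // not the start of a number: skip one character
        }
    }
    return numbers;
}
```

Because the scan starts only at a digit, wcstod always consumes at least one character, so the loop cannot stall.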
Next come spaces.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
One more (unconditional → if): a character that is neither a digit nor an operator is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}
Time to refactor. The loop moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
So far only the plus operator is recognized. The next test is an expression that uses every operator and both parentheses.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
Operator gets the remaining values.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer now checks the current character against all of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
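The same membership check can be tried in isolation. In the sketch below, IsOperatorChar is a hypothetical free-function counterpart of Tokenizer::IsOperator that takes the character as a parameter; the enumeration is repeated so that the snippet is self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Repeated from the text so that the sketch compiles on its own.
enum class Operator : wchar_t
{
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical helper: true if the character denotes one of the operators.
inline bool IsOperatorChar(wchar_t symbol)
{
    const std::initializer_list<Operator> all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [symbol](Operator o) { return symbol == static_cast<wchar_t>(o); });
}
```

A drawback of this approach is that the list has to be kept in sync with the enumeration by hand: an enumerator added to Operator but not to the list would be silently skipped by the lexer.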
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
To print values when a check fails, Assert needs a ToString function for the compared type, so we define one for Token.

inline std::wstring ToString(const Token &token) {
    return { static_cast<wchar_t>(token) };
}
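The round trip between a character and an Operator can be sketched on its own. This is an illustration under assumptions: ToOperator and DemoOperatorRoundTrip are hypothetical helpers that do not appear in the article, and the enum is a stand-alone copy with a single enumerator.

```cpp
#include <cassert>
#include <string>

// Stand-alone copy of the enum: the underlying type is wchar_t and the
// enumerator's value is the operator's own character.
enum class Operator : wchar_t { Plus = L'+' };

// Converts an operator back to the one-character string it was read from.
inline std::wstring ToString(Operator op) {
    return std::wstring(1, static_cast<wchar_t>(op));
}

// Hypothetical helper: reinterprets an input character as an Operator.
// No validation is done here; the caller must know the character is an operator.
inline Operator ToOperator(wchar_t ch) {
    return static_cast<Operator>(ch);
}

// Small self-check of the round trip.
inline void DemoOperatorRoundTrip() {
    assert(ToOperator(L'+') == Operator::Plus);
    assert(ToString(Operator::Plus) == L"+");
}
```

The cast is only safe for characters that correspond to an enumerator, which is why the lexer later checks the character before converting it.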
The test now compiles but fails, because Tokenize always returns an empty list. To make it pass, apply (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return {};
    }
    return { static_cast<Operator>(expr[0]) };
}
The test passes. Next on the list: tokenizing a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
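This test does not compile: a token is still just an operator and cannot hold a number. There are several ways to let Token carry either kind of value:
- Make Token the base of a class hierarchy and recover the concrete type with dynamic_cast.
- Hide the difference behind std::function.
- Use a ready-made type such as Boost.Any, which would add a dependency on a third-party library.
We take a simpler route: a single Token class that knows its own type. Since Token gets behavior of its own, it is developed test-first in a separate test class.
The first test: an operator token reports its type.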
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
Assert also needs a ToString function for TokenType; it is trivial and is not shown here. The next test covers number tokens.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The test fails. Apply (constant → scalar): the constant becomes a member variable set by the constructor.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, an operator token should give back its operator.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
Store the operator in the token and add a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same for a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in an anonymous union. The conversion operators check the type and throw if the token holds the other kind of value.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
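The type check in the conversion operators is easy to observe in isolation. The sketch below repeats a stand-alone copy of the Token class and adds two hypothetical helpers, ThrowsOnNumberAccess and DemoTokenAccess, which are not part of the article.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Stand-alone copy of the tagged-union Token described above.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: reports whether reading the token as a number throws.
inline bool ThrowsOnNumberAccess(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
    } catch (const std::logic_error &) {
        return true;
    }
    return false;
}

// Small self-check: the tag guards access to the inactive union member.
inline void DemoTokenAccess() {
    assert(ThrowsOnNumberAccess(Token(Operator::Plus)));
    assert(!ThrowsOnNumberAccess(Token(1.5)));
    assert(static_cast<double>(Token(1.5)) == 1.5);
}
```

Because the tag is checked on every access, reading the inactive union member is turned into an exception instead of undefined behavior.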
With Token in place, we return to the lexer and the single-digit test. The simplest way to pass it:

if(expr[0] >= '0' && expr[0] <= '9') {
    return { (double) expr[0] - '0' };
}
return { static_cast<Operator>(expr[0]) };
The test passes. Next: a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
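Parsing the number by hand is unnecessary: the C standard library already has isdigit to recognize a digit and atof to convert a string to a number. Since the input consists of wchar_t characters, we use their wide-character counterparts. This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return {};
    // _wtof is the Microsoft-specific wide-character counterpart of atof.
    if(iswdigit(*current)) return { _wtof(current) };
    return { static_cast<Operator>(*current) };
}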
The test passes. Next: an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
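The test fails, since Tokenize returns after the first token. Before changing the behavior we do a small preparatory refactoring: tokens are collected in a local variable result, and the earlier tests still pass.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}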
Now the transformation itself: (if → while). The loop consumes the input token by token.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the parsed number
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
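wcstod replaces _wtof because it also returns a pointer to the first character after the parsed number, which tells the loop where to continue. For an operator, the pointer simply advances by one character.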
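The end-pointer behavior of wcstod can be checked with a few lines. This is a minimal sketch: ScanResult, ScanNumber and DemoScanNumber are hypothetical names used only for this illustration, and the default "C" locale (decimal point '.') is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical result type: the parsed value and how many characters it used.
struct ScanResult {
    double value;
    std::size_t length;
};

// Parses a number at the start of text and reports where it ended.
inline ScanResult ScanNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    // end points at the first character that is not part of the number.
    return { value, static_cast<std::size_t>(end - text) };
}

// Small self-check: "12.34" is consumed, "+5" is left for the next step.
inline void DemoScanNumber() {
    const ScanResult result = ScanNumber(L"12.34+5");
    assert(result.value == 12.34);
    assert(result.length == 5);
}
```

Note that wcstod accepts more than plain decimals, for example exponent forms such as 1e3, so a lexer built on it inherits that grammar for anything that starts with a digit.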
The test passes. Next: whitespace should be skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
One more (unconditional → if): only a known operator produces a token, and any other character is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current; // skip spaces and anything unrecognized
    }
}
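All tests pass, but the function has grown. Time to refactor: the logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}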
Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
After the refactoring all tests still pass. The last lexer test is a complex expression with every operator and both parentheses.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
        Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
        Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens);
}
First, Operator is extended with the remaining operators and the parentheses.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
Then IsOperator in Tokenizer is updated to recognize all of them.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
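The same membership check can be written as a free function and tried without the Tokenizer class. This is a sketch under assumptions: IsOperatorChar and DemoIsOperatorChar are hypothetical names, and a plain array stands in for the initializer list used in the member function.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function variant of Tokenizer::IsOperator:
// true if ch is the character of one of the known operators.
inline bool IsOperatorChar(wchar_t ch) {
    const Operator all[] = { Operator::Plus, Operator::Minus, Operator::Mul,
                             Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(std::begin(all), std::end(all),
        [ch](Operator o) { return ch == static_cast<wchar_t>(o); });
}

// Small self-check: operators are recognized, digits and spaces are not.
inline void DemoIsOperatorChar() {
    assert(IsOperatorChar(L'*'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
}
```

Listing the enumerators explicitly means a new operator has to be added in two places, the enum and this list; that is acceptable for six operators but worth remembering when the set grows.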
All tests pass. The complete code at this point:

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
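The listings above depend on Visual Studio's test framework. For readers on another toolchain, the scanning loop can be tried with the standard library alone. The following is a simplified sketch, not the article's code: Lexemes, TokenizeToLexemes and DemoTokenizeToLexemes are hypothetical names, and each token is kept as the text it was read from instead of a Token object.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Simplified stand-in for Tokens: every token is stored as its source text.
typedef std::vector<std::wstring> Lexemes;

// Hypothetical portable variant of the lexer loop: numbers are delimited
// with wcstod, known operators become one-character lexemes, and every
// other character (spaces included) is skipped.
inline Lexemes TokenizeToLexemes(const std::wstring &expr) {
    static const std::wstring operators = L"+-*/()";
    Lexemes result;
    const wchar_t *current = expr.c_str();
    while (*current != L'\0') {
        if (std::iswdigit(static_cast<std::wint_t>(*current))) {
            wchar_t *end = nullptr;
            std::wcstod(current, &end); // used only to find where the number ends
            result.push_back(std::wstring(current, static_cast<std::size_t>(end - current)));
            current = end;
        } else if (operators.find(*current) != std::wstring::npos) {
            result.push_back(std::wstring(1, *current));
            ++current;
        } else {
            ++current;
        }
    }
    return result;
}

// Small self-check mirroring two of the article's lexer tests.
inline void DemoTokenizeToLexemes() {
    const Lexemes spaced = TokenizeToLexemes(L" 1 + 12.34 ");
    assert(spaced.size() == 3);
    assert(spaced[0] == L"1" && spaced[1] == L"+" && spaced[2] == L"12.34");
    assert(TokenizeToLexemes(L"1+2*3/(4-5)").size() == 11);
}
```

Keeping lexemes as strings avoids the tagged union entirely, at the cost of parsing numbers again later; the article's Token stores the converted value instead.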
Interpreter.h#pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp#include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
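The token list is a plain std::vector, and Tokens is an alias for it. The first version of the header only has to compile: Tokenize throws, so the test fails (red).

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception(); // not implemented yet, keeps the test red
    }

    } // namespace Lexer
    } // namespace Interpreter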
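To make a failing test pass, TDD asks for the simplest change that works. Which change counts as simplest is the subject of The Transformation Priority Premise (TPP): code transformations are ordered from simple to complex, and the ones higher in the list are preferred over the ones below them.

The transformations, in priority order:
- ({} → nil) From no code at all to code that returns nothing meaningful.
- (nil → constant) From nothing to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) From a constant to a variable or an argument.
- (statement → statements) Adding more unconditional statements (break, continue, return and the like).
- (unconditional → if) Splitting the execution path.
- (scalar → array) From a single value to an array.
- (array → container) From an array to a container.
- (statement → recursion) From a statement to recursion.
- (if → while) From a condition to a loop.
- (expression → function) Replacing an expression with a function or an algorithm.
- (variable → assignment) Replacing the value of a variable.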
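The very first transformation, ({} → nil), is enough to turn the test green:

    inline Tokens Tokenize(std::string expr) {
        return{};
    }

The test passes and there is nothing to refactor yet, so the next test is taken from the test list.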
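Before moving on, std::string is replaced with std::wstring so that expressions are handled as Unicode text. The next test expects a single plus operator:

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }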
AssertRange is a small helper namespace. Its AreEqual compares an expected initializer list with an actual range: it checks the sizes first, then each pair of elements, and reports the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        // Compare sizes first so that a length mismatch is reported clearly.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
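The same comparison logic can be tried outside Visual Studio. The following is only a sketch in standard C++: `DescribeMismatch` is a hypothetical name, not part of the article's code or of CppUnitTestFramework, and it returns a message instead of failing a test.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not from the article): returns an empty string when the
// ranges match, otherwise a short description of the first difference.
template<class T, class ActualRange>
std::wstring DescribeMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    const auto expectSize = std::distance(expect.begin(), expect.end());
    const auto actualSize = std::distance(std::begin(actual), std::end(actual));
    if(expectSize != actualSize) {
        return L"Size differs.";
    }
    auto actualIter = std::begin(actual);
    std::size_t position = 0;
    for(const T &expected : expect) {
        if(!(expected == *actualIter)) {
            return L"Mismatch in position " + std::to_wstring(position);
        }
        ++actualIter;
        ++position;
    }
    return std::wstring();
}
```

Checking the sizes before the elements keeps the message useful when one range is simply shorter than the other.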
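Operator is an enumeration whose underlying type is wchar_t, so each enumerator can hold the character of the operator it stands for. For now a token is nothing more than an operator:

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;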
For Assert to print the values when a comparison fails, a ToString function must exist for the compared type:

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }
Returning a constant would break the first test, so the execution path is split with (unconditional → if):

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }

Both tests pass. The next test is a single digit:

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
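This test does not compile, because a token now has to hold either an operator or a number. Several designs are possible:
- A hierarchy of classes derived from Token, with dynamic_cast to recover the concrete type.
- A token built on std::function.
- Boost.Any or a similar type-erasing wrapper.

The simplest option is a single Token class that stores a type tag next to its value. The lexer test is set aside for a moment, and the Token class gets its own tests.

    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };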
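Assert::AreEqual also needs a ToString function for TokenType, which is straightforward to add. The next test covers number tokens:

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }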
The test fails. Applying (constant → scalar) replaces the constant return value with a member variable:

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };

The next test reads the operator back from a token:

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
A stored value and a conversion operator make it pass:

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };

The same is needed for numbers:

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two values can share storage in an anonymous union. Each conversion operator checks the type tag and throws on a mismatch.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        // Only the member that matches m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }
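To see how the checked conversions behave, here is a reduced, self-contained copy of the class with a single operator. `ThrowsWhenReadAsNumber` is a hypothetical helper added only for this illustration; it is not part of the article's code.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {            // only the member selected by m_type is valid
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: reports whether reading the token as a number is rejected.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
    } catch(const std::logic_error &) {
        return true;
    }
    return false;
}
```

A number token converts to double without complaint, while an operator token read as a number raises std::logic_error instead of silently reinterpreting the union.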
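With the Token class in place, the lexer test for a single digit compiles again and fails. The simplest change that makes it pass:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };

All tests are green. The next test is a floating-point number:

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }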
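Parsing the number by hand is not worth the effort: the C library already offers isdigit and atof, along with iswdigit and _wtof for wchar_t strings (_wtof is specific to the Microsoft runtime). This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }

The next test combines an operator and a number:

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }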
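The test fails, because Tokenize returns after the first token. A small refactoring comes first: the tokens are collected in a local variable result, which is returned at the end.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }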
Now the transformation (if → while): scanning continues until the end of the string.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // resume right after the number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
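wcstod replaces _wtof here because it also reports, through its second argument, where the number ends, so scanning can resume from that position.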
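The end-pointer idiom can be tried in isolation. In this minimal sketch, `ExtractNumbers` is a hypothetical helper, not part of the article's lexer: it keeps only the numbers and skips every other character. Like the loop above, it checks iswdigit before calling wcstod, because wcstod on its own would also accept leading whitespace and a sign.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <vector>

// Hypothetical helper: collects every number found in the text.
inline std::vector<double> ExtractNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    const wchar_t *current = text;
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(current, &end));
            current = end;   // wcstod stored the position just past the number
        }
        else {
            ++current;       // not the start of a number: skip one character
        }
    }
    return numbers;
}
```

Because the pointer jumps straight to `end`, a multi-digit number such as 12.34 is consumed in one step rather than digit by digit.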
The next test checks that spaces are skipped:

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
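An (unconditional → if) transformation is enough: a character becomes an operator token only if it matches a known operator, and anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current; // unknown character (for example a space): skip it
        }
    }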
The tests pass, but the function has grown and mixes several levels of detail, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
The tests still pass after the refactoring. What remains is support for the other operators and for parentheses, which a single test on a complex expression covers:

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
The Operator enumeration gets the missing values:

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
The test compiles but fails. IsOperator in Tokenizer has to recognize every operator:

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
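The same check can also be written as a lookup in a string of operator characters. This is a different technique from the std::any_of version above, shown only as a sketch; `IsOperatorChar` is a hypothetical free function, not part of the article's Tokenizer.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative: looks the character up in a fixed string.
// The explicit check for L'\0' matters, because wcschr also matches the
// terminating null character of the string it searches.
inline bool IsOperatorChar(wchar_t c) {
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The any_of version keeps the list of operators tied to the Operator enumeration, whereas this variant repeats the characters in a literal, so the two have to be kept in sync by hand.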
All tests pass. The complete code at this point:

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <cwctype>
    #include <initializer_list>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        // Only the member that matches m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    } // namespace InterpreterTests
The code for this article is available on GitHub.
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
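To get the test to compile, we declare Tokens as a std::vector of Token and add a Tokenize stub. The stub throws, so the first run fails as it should (red).
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: always throws, so the first test fails.
inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter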
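Now the test has to pass with the smallest possible change to the code. A useful guide here is The Transformation Priority Premise (TPP). Refactorings change the structure of code without changing its behavior; transformations change its behavior. As the tests become more specific, the code should become more general, and each new test should be satisfied by the simplest transformation that works.
The transformations, from the simplest to the most complex:
- ({} → nil) From no code at all to code that returns nothing.
- (nil → constant) Replace nil with a constant.
- (constant → constant+) Replace a simple constant with a more complex one.
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements (break, continue, return and the like).
- (unconditional → if) Split the execution path.
- (scalar → array) Replace a variable or argument with an array.
- (array → container) Replace an array with a more complex container.
- (statement → recursion) Replace a statement with recursion.
- (if → while) Replace a condition with a loop.
- (expression → function) Replace an expression with a function or algorithm.
- (variable → assignment) Change the value of a variable.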
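To make the list less abstract, here is a small sketch of how three of these transformations might shape one function over successive tests. CountPluses and the Step1, Step2 and Step3 namespaces are hypothetical names invented for this illustration; they are not part of the interpreter.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example: counts L'+' characters in an expression.

namespace Step1 {
// (nil -> constant): the first test only expects 0 for an empty string.
inline std::size_t CountPluses(const std::wstring &) { return 0; }
} // namespace Step1

namespace Step2 {
// (unconditional -> if): a test with L"+" forces a split in the execution path.
inline std::size_t CountPluses(const std::wstring &expr) {
    if(expr.empty()) return 0;
    return expr[0] == L'+' ? 1 : 0;
}
} // namespace Step2

namespace Step3 {
// (if -> while): a test with several characters turns the branch into a loop.
inline std::size_t CountPluses(const std::wstring &expr) {
    std::size_t count = 0;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(*current == L'+') ++count;
        ++current;
    }
    return count;
}
} // namespace Step3
```

Each step is only as general as the tests written so far demand, which is the point of preferring transformations near the top of the list.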
Returning an empty list is enough to turn the test green.
inline Tokens Tokenize(std::string expr) {
    return{};
}
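The test passes and there is nothing to refactor yet, so we move on to the next test: a single plus operator. Note that std::string gives way to std::wstring here, so that expressions are handled as Unicode text.
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}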
AssertRange is a small helper namespace of our own. Its AreEqual compares the expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.
AssertRange
// Lives in the test file, which has "using namespace std;" in effect.
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
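The same comparison can be tried outside Visual Studio. The sketch below is a stand-in written under that assumption, not the article's helper: FirstMismatch is a made-up name, and instead of calling Assert it returns the position of the first difference, or -1 when the ranges are equal.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first differing element,
// the length of the shorter range if one is a prefix of the other,
// or -1 if the ranges are equal.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    // One range ended early: report where the lengths diverge.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) return position;
    return -1;
}
```

Taking the expected values as an initializer_list keeps the call site as short as in the tests above, for example FirstMismatch({ 1, 2, 3 }, values).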
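Operator is an enum class with wchar_t as its underlying type, so the value of each enumerator is the operator's own character and converting between the two is a plain cast. For now Token is simply an alias for Operator.
enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;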
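For Assert to print the values when a comparison fails, the test framework needs a ToString function for the compared type, so we provide one.
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}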
Now everything compiles and the test fails. To make it pass we split the execution path, that is, apply (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test puts a single digit through the lexer.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
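The test does not compile: a token now has to hold either an operator or a number. There are several ways to arrange this:
- A class hierarchy with a common Token base. Getting at the concrete type would then require dynamic_cast.
- std::function.
- Boost.Any or a similar type-erasing container.
- A single Token class that stores its type next to the value.
We take the last option and develop the Token class test-first as well.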
We start with a test for the type of an operator token.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
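ToString has to be provided for TokenType too, otherwise the Assert call does not compile. The next test covers number tokens.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}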
The test fails. We apply (constant → scalar) and replace the constant with a member variable.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, an operator token should give back its operator.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
A conversion operator does the job.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same goes for the value of a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the stored type and throw if the token is read as the wrong kind.
Token
// Requires <stdexcept> and <string>.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
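A short standalone sketch of how such a tagged union behaves when it is read as the wrong kind. It repeats a trimmed copy of Token so that it compiles on its own; the helper ThrowsWhenReadAsOperator is a hypothetical name and not part of the article's code.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Trimmed copy of Token: a type tag plus a union of the two payloads.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true if reading the token as an operator is rejected.
inline bool ThrowsWhenReadAsOperator(const Token &token) {
    try {
        (void)static_cast<Operator>(token);
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

The type check in the conversion operators is what keeps the union safe: only the member that was written by the constructor is ever read.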
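Back to the lexer. The single-digit test passes with the simplest possible check:
if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };
The next test is a number with several digits and a fractional part.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}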
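The C standard library already has what we need: isdigit and atof, or rather their counterparts for wchar_t. So we apply (expression → function) and replace the hand-written expressions with library calls.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    // _wtof is specific to the Microsoft C runtime.
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}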
The next test has two tokens in one expression.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
Tokenize now has to return more than one token. As a preparatory step we introduce a result variable and push tokens into it instead of returning them directly.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Then the transformation itself: (if → while). The condition becomes a loop that runs until the end of the string.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
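Note that wcstod has replaced _wtof. Unlike _wtof, it reports through its second argument where the parsed number ends, and that is exactly the position from which scanning has to continue.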
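The role of the end pointer is easy to see in isolation. The following sketch uses only the standard wcstod; ExtractNumbers is a hypothetical helper written for this illustration, not part of the lexer.

```cpp
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in the text,
// stepping over any character at which no number can be parsed.
inline std::vector<double> ExtractNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current) {
        wchar_t *end = nullptr;
        double value = std::wcstod(current, &end);
        if(end == current) {
            ++current;       // nothing was parsed here, skip one character
        }
        else {
            numbers.push_back(value);
            current = end;   // resume right after the parsed number
        }
    }
    return numbers;
}
```

One caveat: wcstod itself accepts leading whitespace and a sign, so this helper would absorb the plus in L"+2" into the number. The lexer avoids that by calling wcstod only when the current character passes iswdigit.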
The next test: spaces must be skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
We apply (unconditional → if) once more: a character becomes an operator token only if it really is an operator, and anything else is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
The function has grown, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize only delegates to it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    // Keeps a pointer into expr, so expr must outlive the tokenizer.
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The last lexer test is an expression with all the operators and parentheses.
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen) }, tokens);
}
First the remaining operators are added to the Operator enum, otherwise the test does not compile.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The test fails. IsOperator in Tokenizer has to recognize every operator, not just the plus.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
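The check can be exercised on its own. Below is a minimal sketch of the same idea as a free function; IsOperatorChar is a hypothetical name, and the enum is repeated so that the block compiles by itself.

```cpp
#include <algorithm>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function version of the member check.
inline bool IsOperatorChar(wchar_t symbol) {
    // auto deduces std::initializer_list<Operator> from the braced list.
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [symbol](Operator o) { return symbol == static_cast<wchar_t>(o); });
}
```

Because the enumerators carry their own characters, the list of operators is the only thing to extend when a new operator is added.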
All tests pass. Here is the complete code at this point.
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    // Keeps a pointer into expr, so expr must outlive the tokenizer.
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen) }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
The code for this article is available on GitHub.
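For the first test to compile, we need a minimal skeleton: an empty Token type, a list of tokens based on std::vector and named Tokens, and a Tokenize function. For now the function only throws an exception, so the test fails.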
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    // Not implemented yet: throwing makes the first test fail (red).
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter
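The test compiles and fails, so now we have to make it pass. To choose the next change to the code, I will use The Transformation Priority Premise (TPP) proposed by Robert C. Martin. A refactoring changes the structure of the code without changing its behavior. A transformation is the opposite: it changes the behavior, making the code a little more general each time. The transformations are ordered from simple to complex, and the premise is that when a test can be passed in several ways, the simpler transformation should be preferred. Tests, in turn, should be chosen so that they can be satisfied by the simplest transformations available. Following TPP keeps the steps small and tends to lead to simpler code.

The transformations, in order of priority:

- ({} → nil) from no code at all to code that returns an empty value.
- (nil → constant) from an empty value to a constant.
- (constant → constant+) from a simple constant to a more complex one.
- (constant → scalar) from a constant to a variable or an argument.
- (statement → statements) adding more unconditional statements (break, continue, return and the like).
- (unconditional → if) splitting the execution path.
- (scalar → array) from a variable to an array.
- (array → container) from an array to a more complex container.
- (statement → recursion) from a statement to recursion.
- (if → while) from a conditional to a loop.
- (expression → function) replacing an expression with a function or an algorithm.
- (variable → assignment) changing the value of a variable.

The first transformation is enough to pass the first test.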
inline Tokens Tokenize(std::string expr) {
    return{};
}
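To see how the transformations accumulate, here is a minimal sketch on a toy problem that is not part of the interpreter. The functions CountDigitsV1, CountDigitsV2 and CountDigitsV3 are hypothetical names introduced only for this illustration. Each version is what the code might look like after one more test and one more transformation.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Hypothetical example, not part of the interpreter: count the digits in a string.

// Step 1, (nil -> constant): the test "an empty string has no digits"
// needs nothing more than a constant.
inline int CountDigitsV1(const std::wstring &) {
    return 0;
}

// Step 2, (unconditional -> if): the test "a single digit counts as one"
// splits the execution path.
inline int CountDigitsV2(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) {
        return 1;
    }
    return 0;
}

// Step 3, (if -> while): the test "digits may appear anywhere"
// turns the single check into a loop over all characters.
inline int CountDigitsV3(const std::wstring &text) {
    int count = 0;
    for(wchar_t ch : text) {
        if(std::iswdigit(ch)) {
            ++count;
        }
    }
    return count;
}

// Every version still satisfies the tests written for the earlier ones.
inline void CheckCountDigits() {
    assert(CountDigitsV1(L"") == 0);
    assert(CountDigitsV2(L"") == 0 && CountDigitsV2(L"7") == 1);
    assert(CountDigitsV3(L"") == 0 && CountDigitsV3(L"7") == 1);
    assert(CountDigitsV3(L"a1b22") == 3);
}
```

The point of the sketch is the order of the steps: the loop appears only when a test demands it, never earlier.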
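The test passes, and there is nothing to refactor yet. Before moving on, let's replace std::string with std::wstring in the signature of Tokenize, so that the lexer works with Unicode strings. The next test checks a single operator.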
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
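AssertRange is a small helper namespace of my own. Its AreEqual function compares the expected list with the actual range element by element and reports either a difference in size or the position of the first mismatch.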
AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare the sizes first, so that a length mismatch is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
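The same element-by-element comparison can be tried outside Visual Studio. The sketch below is an assumption-laden stand-in, not the article's helper: FirstMismatch is a hypothetical function that returns a position instead of calling Assert, so it depends only on the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first position where the two
// ranges differ, or -1 if they have the same size and equal elements.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = expect.begin();
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;

    for(; expectIter != expect.end() && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) {
            return index;
        }
    }

    // One range ended early: report the position where the shorter one stops.
    if(expectIter != expect.end() || actualIter != std::end(actual)) {
        return index;
    }
    return -1;
}

// A short usage check.
inline void CheckFirstMismatch() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1);
    assert(FirstMismatch({ 1, 2, 4 }, actual) == 2);
}
```

Returning an index keeps the helper independent of any test framework; a framework-specific wrapper can turn the index into a failure message.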
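Operator is an enumeration. Its underlying type is wchar_t, and the value of each enumerator is the character of the corresponding operator, so a character can be turned into an operator with a simple cast. For now, Token is just another name for Operator.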
enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;
For Assert to be able to print values of our type in failure messages, we also have to provide a ToString function for it.
std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
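Now the test compiles and fails. To pass it without breaking the previous test, we split the execution path (unconditional → if).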
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
Both tests pass. The next test is for a number that consists of a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
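This test does not even compile: a token now has to hold either an operator or a number. There are several ways to achieve this:

- A class hierarchy with Token as the base class, where the concrete kind of token is recovered with dynamic_cast.
- A design built around callbacks stored in std::function.
- A generic container such as Boost.Any, at the cost of a third-party dependency.

Here we take a simpler route: a single Token class that stores its type together with its value.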
Let's start with the token type. The first test for the new Token class checks the type of an operator token.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
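A ToString function for TokenType is also needed for the test to compile; it is trivial, so I omit it here. The next test checks the type of a number token.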
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The test fails. Applying (constant → scalar), we replace the constant with a member variable.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, a token has to return its value. Let's start with the operator.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}

To pass the test, the token stores the operator and returns it from a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};

The same goes for the number.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
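A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators also check the type of the token before returning a value.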
Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only one of the two members is meaningful, depending on m_type.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // A wide literal is required: the function returns std::wstring.
            return L"Unknown token.";
    }
}
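As a usage illustration, the sketch below shows how calling code might consume such tokens: check Type() first, then convert. The types are a condensed copy of the ones above so that the example compiles on its own. SumNumbers and ThrowsOnWrongConversion are hypothetical helpers that do not appear in the article.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Condensed copy of the article's types, repeated so the sketch is self-contained.
enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical consumer: adds up the number tokens and ignores the operators.
inline double SumNumbers(const std::vector<Token> &tokens) {
    double sum = 0.0;
    for(const Token &token : tokens) {
        if(token.Type() == TokenType::Number) {
            sum += static_cast<double>(token);
        }
    }
    return sum;
}

// Hypothetical check: converting a token to the wrong kind throws std::logic_error.
inline bool ThrowsOnWrongConversion(const Token &token) {
    try {
        if(token.Type() == TokenType::Number) {
            static_cast<void>(static_cast<Operator>(token));
        } else {
            static_cast<void>(static_cast<double>(token));
        }
    } catch(const std::logic_error &) {
        return true;
    }
    return false;
}

inline void CheckTokenUsage() {
    assert(ThrowsOnWrongConversion(Token(1.0)));
    assert(ThrowsOnWrongConversion(Token(Operator::Plus)));
}
```

Checking the type before converting keeps the exception for genuine programming errors rather than ordinary control flow.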
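Now a token can hold both operators and numbers, and we can return to the lexer test for a single digit, which finally compiles. The simplest way to pass it: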
if(expr[0] >= L'0' && expr[0] <= L'9') {
    // A single digit character converts directly to its numeric value.
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };
The tests pass, but a single digit is not yet a number. The next test is for a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
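To parse the number, I will use functions from the C standard library: isdigit to recognize a digit and atof to convert a string to a number, or rather their counterparts for wchar_t. This is the (expression → function) transformation.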
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    // _wtof is the Microsoft-specific wide-character counterpart of atof.
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
The test passes. The next test is for an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
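The test fails, because only the first token is returned. Before handling several tokens, let's prepare the code with a small refactoring: we introduce a variable named result that accumulates the tokens and is returned at the end.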
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now the transformation itself: (if → while). After scanning a token, we move on to the next one.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
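wcstod does the same job as _wtof, but it also returns, through its second argument, a pointer to the first character after the parsed number. That is exactly what we need in order to continue scanning from the right place.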
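The role of the end pointer is easier to see in isolation. The sketch below is a hypothetical helper, ScanNumbers, that is not part of the lexer: it collects every number in a string and relies only on the standard std::wcstod.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in the text and skips
// all characters that do not start a number.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current != L'\0') {
        wchar_t *end = nullptr;
        const double value = std::wcstod(current, &end);
        if(end == current) {
            // Nothing was parsed at this position: step over one character.
            ++current;
        }
        else {
            numbers.push_back(value);
            // wcstod reports where the number ended, so scanning resumes there.
            current = end;
        }
    }
    return numbers;
}

inline void CheckScanNumbers() {
    assert(ScanNumbers(L"").empty());
    assert(ScanNumbers(L"0.5*2").size() == 2);
}
```

Note that wcstod on its own also accepts leading whitespace and a sign, so a lexer that calls it unconditionally would swallow a "+" or "-" in front of a digit. Checking iswdigit first, as the lexer here does, keeps operators and numbers apart.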
All the tests pass. The next test checks that spaces are skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
Applying (unconditional → if), we add a check for the operator and skip every other character.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
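The tests pass, but the function has grown, so it is time to refactor. Let's move the logic into a Tokenizer class in the Detail namespace. The Tokenize function then simply delegates to it.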
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    // Points into the expression passed to the constructor, which must outlive the tokenizer.
    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
The code now reads almost like a description of the algorithm. Finally, let's check a complex expression that contains all the operators and parentheses.

TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
First we add the missing values to Operator, so that the test compiles.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The test fails. It remains to teach the IsOperator method of Tokenizer to recognize all of the operators.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
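For comparison, the same check can be written as a lookup in a string of operator characters. This is a different technique from the any_of version above, shown only as a sketch; IsOperatorChar is a hypothetical free function, and the character list is assumed to mirror the Operator enumeration.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to the any_of version: look the character up in a
// string that lists every operator symbol.
inline bool IsOperatorChar(wchar_t ch) {
    // wcschr would also match the terminating null character, so exclude it.
    return ch != L'\0' && std::wcschr(L"+-*/()", ch) != nullptr;
}

inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'+'));
    assert(!IsOperatorChar(L'1'));
}
```

The lookup is shorter, but the list of symbols then lives in two places, the enumeration and the string, and the two have to be kept in sync by hand. The any_of version avoids that by using the enumerators themselves.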
All the tests pass, and the lexer is complete. The full code at this point is given below.
Interpreter.h

#pragma once
#include <algorithm>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    // Only one of the two members is meaningful, depending on m_type.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    // Points into the expression passed to the constructor, which must outlive the tokenizer.
    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare the sizes first, so that a length mismatch is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
GitHub . , . "__".
. , .
std::vector, Tokens
. .
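#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Deliberately failing stub: it exists only so that the first test compiles and turns red.
inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter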
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
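The transformations, in priority order:
- ({} → nil): from no code at all to code that uses nil.
- (nil → constant): replace nil with a constant.
- (constant → constant+): replace a simple constant with a more complex one.
- (constant → scalar): replace a constant with a variable or an argument.
- (statement → statements): add more unconditional statements (break, continue, return and the like).
- (unconditional → if): split the execution path.
- (scalar → array): replace a variable or argument with an array.
- (array → container): replace an array with a more complex container.
- (statement → recursion): replace a statement with recursion.
- (if → while): replace a condition with a loop.
- (expression → function): replace an expression with a function or an algorithm.
- (variable → assignment): change the value of a variable.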
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
In the next test, std::string gives way to std::wstring, so that expressions are handled as Unicode text.
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare the sizes first, then the elements one by one.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
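The element-by-element comparison behind AssertRange::AreEqual can also be tried outside Visual Studio. The sketch below is only an illustration under that assumption: FirstMismatch is a hypothetical helper that is not part of the article or of CppUnitTestFramework, and it returns a position instead of failing a test.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper (not from the article): returns the index of the first
// position where the two ranges differ, or -1 when they match in both
// content and length.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = expect.begin();
    auto actualIter = std::begin(actual);
    for(; expectIter != expect.end() && actualIter != std::end(actual); ++expectIter, ++actualIter) {
        if(!(*expectIter == *actualIter)) {
            return std::distance(expect.begin(), expectIter);
        }
    }
    // One range ended before the other: report the length of the shorter one.
    if(expectIter != expect.end() || actualIter != std::end(actual)) {
        return std::distance(expect.begin(), expectIter);
    }
    return -1;
}
```

As in the article's helper, the element type T is deduced from the braced list, so a call such as FirstMismatch({ 1, 2, 3 }, values) needs no explicit template arguments.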
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
For Assert to report the values it compares, a ToString function has to exist for the compared type.
std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
A token holds either an operator or a number, never both, so the two values are stored in a union.
Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
, , , . . .
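The conversion operators of Token guard the union: reading a token as the wrong kind throws std::logic_error instead of returning garbage. The sketch below repeats a trimmed copy of the class so that it compiles on its own. ThrowsOnWrongKind is a hypothetical helper added only for illustration, not part of the article's code.

```cpp
#include <cassert>
#include <stdexcept>

// Trimmed copies of the article's types, repeated so the sketch stands alone.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: tries to read the token as the kind it is NOT and
// reports whether that attempt was rejected with std::logic_error.
inline bool ThrowsOnWrongKind(const Token &token) {
    try {
        if(token.Type() == TokenType::Number) {
            (void)static_cast<Operator>(token);
        } else {
            (void)static_cast<double>(token);
        }
    } catch(const std::logic_error &) {
        return true;
    }
    return false;
}
```

Because Operator is a scoped enum, it does not convert implicitly to or from double, so the two conversion operators never compete with each other in overload resolution.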
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
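The C standard library already has what is needed here: isdigit and atof, or rather their wchar_t counterparts. This is the (expression → function) transformation.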
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
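wcstod replaces _wtof here because, through its second argument, it also reports where the parsed number ends, so scanning can continue from that position.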
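To see how the end pointer of wcstod drives the scan, here is a minimal sketch that keeps only the number branch of the loop. ScanNumbers is a hypothetical name used for illustration. The sketch assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <vector>

// Hypothetical helper (not from the article): collects every number found in
// the text and skips any character that does not start a number.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    const wchar_t *current = text;
    while(*current != L'\0') {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod parses as much as it can and stores in 'end' the
            // address of the first character it did not consume.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        } else {
            ++current;
        }
    }
    return numbers;
}
```

Since wcstod is called only when the current character is a digit, it always consumes at least one character, so the loop cannot stall on the same position.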
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}
The scanning logic moves into a class in the Detail namespace, and Tokenize becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
The IsOperator method of Tokenizer now has to recognize every one of them.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
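For comparison, the same membership check can be written as a lookup in a string of operator characters. This is a possible alternative, not what the article does: IsOperatorChar is a hypothetical free function, and the literal L"+-*/()" is an assumption that would have to be kept in sync with the Operator enum by hand.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to Tokenizer::IsOperator: looks the character up
// in a literal that lists every operator character.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also matches the terminating L'\0', so rule that out first.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The std::any_of version in the article derives the characters from the enum values themselves, which avoids having a second list of operator characters to maintain.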
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
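To make it pass, apply the (constant → scalar) transformation: the constant returned by Type() becomes a field that is set in the constructor.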
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
The next test reads the operator back from an operator token.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
The token now stores the operator and exposes it through a conversion operator.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same test for a number token:
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in a union.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // Each conversion is only valid for a token of the matching type.
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        // The literal must be wide: the function returns std::wstring.
        default: return L"Unknown token.";
    }
}
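To see how the checked conversions behave, here is a small self-contained sketch. The Token class repeats the one above with a reduced Operator enum; NumberReadIsRejected and TokenDemo are hypothetical helpers added only for illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Same shape as the article's Token: a type tag plus a union of payloads.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a number is
// rejected with std::logic_error.
inline bool NumberReadIsRejected(const Token &token)
{
    try {
        const double value = static_cast<double>(token);
        (void)value; // only the success of the conversion matters here
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

// Usage example for the sketch above.
inline void TokenDemo()
{
    assert(NumberReadIsRejected(Token(Operator::Plus))); // wrong kind of token
    assert(!NumberReadIsRejected(Token(1.23)));          // matching kind
    assert(static_cast<double>(Token(1.23)) == 1.23);
}
```

The type tag makes a wrong read fail loudly instead of silently reinterpreting the bytes of the other union member.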
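Each conversion operator checks the token type and throws std::logic_error on a mismatch. Back in the lexer, the single-digit test can now be satisfied in the simplest possible way: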
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
This works, but only for a single digit. The next test is a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
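The C standard library already provides what is needed here: isdigit and atof, or rather their wchar_t counterparts. This is the (expression → function) transformation.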
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
Green. The next test combines an operator with a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
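The test fails, because only the first token is returned. Before changing the behavior, a small refactoring: the tokens are collected in a local variable, result, which is returned at the end.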
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
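Now the transformation itself: (if → while). The separate check for an empty string is no longer needed, because the loop covers it.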
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
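wcstod is used here instead of _wtof because it also reports where the number ends, so scanning can continue from that position.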
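The role of the end pointer is easiest to see in isolation. Below is a minimal sketch using only the standard library; ScanNumbers and ScanNumbersDemo are hypothetical names that do not appear in the article.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in the text and skips
// all other characters.
inline std::vector<double> ScanNumbers(const std::wstring &text)
{
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod converts the longest valid prefix and stores in 'end'
            // a pointer to the first character it did not consume.
            numbers.push_back(std::wcstod(current, &end));
            current = end; // continue right after the parsed number
        }
        else {
            ++current;     // not the start of a number, skip one character
        }
    }
    return numbers;
}

// Usage example for the sketch above.
inline void ScanNumbersDemo()
{
    const std::vector<double> numbers = ScanNumbers(L"1+12.5");
    assert(numbers.size() == 2);
    assert(numbers[0] == 1.0);
    assert(numbers[1] == 12.5);
}
```

Checking iswdigit first, as the article's lexer does, matters: wcstod on its own would also accept a leading sign, so the plus in "1+12.5" would be absorbed into the second number instead of being treated as a separate character.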
The tests pass. The next one checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
The (unconditional → if) transformation again: only a known operator becomes a token, and any other character is skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
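Everything is green, so it is time to refactor. The scanning logic moves into a class in the Detail namespace, and Tokenize becomes a thin wrapper around it.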
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
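The code is now easy to extend. The remaining operators and the parentheses are covered by a single test with a more complex expression.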
TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
Operator gets the missing enumerators.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer has to recognize them too.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
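The same check can be tried as a free function, without the Tokenizer class around it. This is only a sketch under that assumption: IsOperatorChar and IsOperatorCharDemo are hypothetical names, and the enum repeats the one from the article.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

// Hypothetical free-function variant of IsOperator: true when the symbol
// is the character code of one of the known operators.
inline bool IsOperatorChar(wchar_t symbol)
{
    static const Operator all[] = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(std::begin(all), std::end(all),
        [symbol](Operator o) { return symbol == static_cast<wchar_t>(o); });
}

// Usage example for the sketch above.
inline void IsOperatorCharDemo()
{
    assert(IsOperatorChar(L'*'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
}
```

Because the enumerator values are the operator characters themselves, adding an operator only requires a new enumerator and one more entry in the list.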
All tests pass. The complete code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer

} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {
    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }
} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code for this article is available on GitHub.
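The first test does not compile yet, so a minimal stub comes first. The token list is a std::vector, aliased as Tokens. Tokenize simply throws for now.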
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: just enough for the first test to compile and fail.
inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer

} // namespace Interpreter
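The test now compiles and fails. The question is how to make it pass with the smallest possible step, and The Transformation Priority Premise (TPP) helps here. TPP describes a set of code transformations, ordered from simple to complex, and suggests preferring the simplest one that makes the failing test pass. Unlike a refactoring, which changes structure but not behavior, a transformation changes behavior and makes the code slightly more general. Following the order keeps the steps small and lowers the risk of getting stuck on a test that would require rewriting a large part of the code.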
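The transformations, in priority order:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) From nil to a constant.
- (constant → constant+) From a simple constant to a more complex one.
- (constant → scalar) Replacing a constant with a variable or an argument.
- (statement → statements) Adding more unconditional statements (break, continue, return and so on).
- (unconditional → if) Splitting the execution path.
- (scalar → array) From a variable to an array.
- (array → container) From an array to a more complex container.
- (statement → recursion) Introducing recursion.
- (if → while) Turning a condition into a loop.
- (expression → function) Replacing an expression with a function or an algorithm.
- (variable → assignment) Changing the value of a variable.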
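A few of these transformations can be shown side by side on a toy version of the lexer. This is a sketch rather than the article's code: Symbols is a hypothetical stand-in for Tokens that stores raw characters, and the TokenizeStage functions and StagesDemo are names invented for the illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the article's Tokens: a list of raw characters.
typedef std::vector<wchar_t> Symbols;

// Stage 0: the stub only throws, so the very first test fails.
inline Symbols TokenizeStage0(const std::wstring &)
{
    throw std::runtime_error("Not implemented.");
}

// Stage 1: the smallest change that satisfies "empty expression gives an
// empty list" is to return an empty result unconditionally.
inline Symbols TokenizeStage1(const std::wstring &)
{
    return{};
}

// Stage 2, (unconditional → if): the execution path splits to handle
// an expression made of a single character.
inline Symbols TokenizeStage2(const std::wstring &expr)
{
    if(expr.empty()) return{};
    return{ expr[0] };
}

// Stage 3, (if → while): the condition becomes a loop, and the special
// case for the empty expression disappears.
inline Symbols TokenizeStage3(const std::wstring &expr)
{
    Symbols result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        result.push_back(*current);
        ++current;
    }
    return result;
}

// Usage example: each stage passes the tests written up to that point.
inline void StagesDemo()
{
    assert(TokenizeStage1(L"").empty());
    assert(TokenizeStage2(L"").empty());
    assert(TokenizeStage2(L"+").size() == 1);
    assert(TokenizeStage3(L"").empty());
    assert(TokenizeStage3(L"+1").size() == 2);
}
```

Each stage is only as general as the tests written so far demand, which is the discipline the priority order is meant to encourage.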
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h

    #pragma once
    #include <algorithm>
    #include <initializer_list>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    // The value of each enumerator is the operator character itself.
    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return { static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    // A token holds either an operator or a number; m_type says which.
    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext(); // anything else is skipped
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end; // wcstod reports where the number ended
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    } // namespace InterpreterTests
GitHub . , . "__".
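To get the first test to compile, the token list is declared as a std::vector under the alias Tokens, and Tokenize starts as a stub that throws, so the test fails (red).

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: compiles, but any test that calls it fails.
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter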
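To pick the smallest change that makes a failing test pass, this article follows The Transformation Priority Premise (TPP). A refactoring changes the structure of code without changing its behavior. A transformation is the counterpart: it changes behavior, moving the code from the specific toward the generic. The premise is that simpler transformations should be preferred over more complex ones, and that the next test should be one that can be passed with the simplest transformation available.

The transformations, from highest priority to lowest:
- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): replace nil with a constant.
- (constant → constant+): replace a simple constant with a more complex one.
- (constant → scalar): replace a constant with a variable or an argument.
- (statement → statements): add more unconditional statements (break, continue, return).
- (unconditional → if): split the execution path.
- (scalar → array): replace a variable or argument with an array.
- (array → container): replace an array with a more complex container.
- (statement → recursion): replace a statement with recursion.
- (if → while): replace a condition with a loop.
- (expression → function): replace an expression with a function or algorithm.
- (variable → assignment): replace the value of a variable.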
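As an illustration of how the transformations in the list stack up, here is a minimal sketch that is not part of the interpreter. The function Total and the Step1 to Step3 namespaces are hypothetical names; each namespace shows the same function after one more test forced one more transformation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: the same function after successive transformations.
namespace Step1 {
// (nil -> constant): the first test only asks for the total of an empty list.
inline int Total(const std::vector<int> &) { return 0; }
}

namespace Step2 {
// (unconditional -> if) and (constant -> scalar): a one-element list
// returns its element instead of a fixed constant.
inline int Total(const std::vector<int> &values) {
    if (values.empty()) return 0;
    return values[0];
}
}

namespace Step3 {
// (if -> while): the condition becomes a loop, which covers any length.
inline int Total(const std::vector<int> &values) {
    int total = 0;
    std::size_t i = 0;
    while (i < values.size()) {
        total += values[i];
        ++i;
    }
    return total;
}
}
```

Each step passes all earlier tests as well as the new one, which is the property the lexer steps in this article rely on.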
Returning an empty list is enough to pass the test.

    inline Tokens Tokenize(std::string expr) { return {}; }
, . .
. . . . . .
Along the way, std::string is replaced with std::wstring so that the lexer can work with Unicode. The next test:

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }
AssertRange is a small helper namespace in the test project. Its AreEqual function takes the expected values as an initializer list, checks that both sequences have the same size, and then compares them element by element, reporting the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
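The same comparison can be tried outside Visual Studio. The sketch below is an assumption-laden stand-in, not the article's helper: FirstMismatch is a hypothetical name, and instead of calling Assert it returns the index of the first difference, or -1 when the sequences are equal.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first position where the two
// sequences differ (including one sequence ending early), or -1 when equal.
template <class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++index) {
        if (!(*expectIter == *actualIter)) return index;
    }
    // One sequence ran out first: the mismatch is at the shorter length.
    if (expectIter != std::end(expect) || actualIter != std::end(actual)) return index;
    return -1;
}
```

Taking the expected values as std::initializer_list<T> is what lets a call site pass a braced list directly, because T is deduced from the elements of the list.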
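Operator is an enumeration whose underlying type is wchar_t, so the value of each enumerator is the operator character itself. For now a token is just an operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;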
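To print a failure message, Assert needs a ToString function for the type being compared, so one has to be written for Token.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return { static_cast<wchar_t>(token) };
    }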
The test now compiles and fails. To pass it, apply the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return {};
        }
        return { static_cast<Operator>(expr[0]) };
    }
. . . …
.
    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
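This test does not compile: a token now has to carry either an operator or a number. Several designs are possible:
- A Token class hierarchy, with dynamic_cast to recover the concrete type.
- A design built around std::function.
- A type-erased holder such as Boost.Any.
This article takes a simpler route: a single Token class that knows its own type.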
… . . . .
.
    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
A ToString function is needed for TokenType as well, so that Assert can print the values. The next test:

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
The test fails. The (constant → scalar) transformation replaces the hard-coded type with a field.

    class Token {
    public:
        Token(Operator) : m_type(TokenType::Operator) {}
        Token(double) : m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
… . . . .
.
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

The token stores the operator and returns it through a conversion operator.

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };

The same test for a number:

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two fields can share storage in a union. Each conversion operator checks the stored type and throws if the token is read as the wrong kind.

Token

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }
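The guard in the conversion operators can be exercised without the test framework. The following is a minimal sketch that repeats the Token class from above with a single operator. RejectsOperatorRead is a hypothetical helper added only for this illustration.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;   // tag: tells which union member is active
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as an Operator throws.
inline bool RejectsOperatorRead(const Token &token) {
    try {
        static_cast<void>(static_cast<Operator>(token));
        return false;
    } catch (const std::logic_error &) {
        return true;
    }
}
```

The tag check matters because reading the inactive member of a union is not something the language guarantees to work. Throwing turns a silent misuse into a visible test failure.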
, , , . . .
. . . .
With Token in place, the single-digit test can pass:

    if(expr[0] >= L'0' && expr[0] <= L'9') {
        return { static_cast<double>(expr[0] - L'0') };
    }
    return { static_cast<Operator>(expr[0]) };
, .
… . .
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

The C standard library already covers this: isdigit recognizes a digit and atof parses a number, and both have wchar_t counterparts. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return {};
        if(iswdigit(*current)) return { _wtof(current) }; // _wtof is specific to the Microsoft C runtime
        return { static_cast<Operator>(*current) };
    }
. .
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

The function has to return more than one token now. First, a refactoring that does not change behavior: tokens are collected in a local variable result, and that variable is returned.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        } else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }
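Now the transformation: (if → while). The check for the end of the string becomes the loop condition.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            } else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }

wcstod replaces _wtof because, besides returning the value, it reports through its second argument where the number ended. The loop needs that position to continue scanning after the number.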
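The end-pointer behavior of wcstod is easy to check in isolation. Below is a minimal sketch. The free function ScanNumber and its rest parameter are hypothetical names for this illustration, not the article's code.

```cpp
#include <wchar.h>

// Hypothetical helper: parses the number at the start of text and sets *rest
// to the first character that is not part of the number.
inline double ScanNumber(const wchar_t *text, const wchar_t **rest) {
    wchar_t *end = nullptr;
    double value = wcstod(text, &end);  // end receives the stop position
    *rest = end;
    return value;
}
```

For the input L"12.34+5" the returned value is 12.34 and rest points at the plus sign, which is exactly where the lexer loop has to resume.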
. .
.
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

An (unconditional → if) transformation handles this: a character becomes an operator token only if it is a known operator, and everything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        } else {
            ++current;
        }
    }
The function has grown enough to be refactored. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
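One consequence of the three-way branch in the loop is worth seeing concretely: the final else skips every character that is neither a digit nor a known operator, not only spaces. The sketch below is a simplified stand-in under stated assumptions. Lexemes is a hypothetical function that records the text of each token as a string instead of building Token objects, and it knows only the plus operator, as the lexer does at this step.

```cpp
#include <cstddef>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

// Hypothetical variant of the scanning loop that stores token text.
inline std::vector<std::wstring> Lexemes(const std::wstring &expr) {
    std::vector<std::wstring> result;
    const wchar_t *current = expr.c_str();
    while (*current != L'\0') {
        if (iswdigit(*current)) {
            wchar_t *end = nullptr;
            wcstod(current, &end);  // called only to find where the number ends
            result.push_back(std::wstring(current, static_cast<std::size_t>(end - current)));
            current = end;
        } else if (*current == L'+') {
            result.push_back(std::wstring(1, *current));
            ++current;
        } else {
            ++current;  // spaces and any other unknown character are skipped
        }
    }
    return result;
}
```

With this sketch, L" 1 + 12.34 " and L"1 a + 2" both produce three lexemes. Silently dropping the letter is acceptable for the current test list, but a later test for invalid input would have to change that branch.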
, . , , . .
    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }

Operator gets the remaining operators and the parentheses.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

The IsOperator method of Tokenizer has to recognize all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
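The membership check can be tried on its own as a free function. This is a minimal sketch: IsOperatorChar is a hypothetical name, and it takes the character as a parameter instead of reading it from the tokenizer's current position.

```cpp
#include <algorithm>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')'
};

// Hypothetical free-function form of IsOperator.
inline bool IsOperatorChar(wchar_t c) {
    // auto deduces std::initializer_list<Operator> from the braced list.
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
                       [c](Operator o) { return c == static_cast<wchar_t>(o); });
}
```

Because each enumerator's value is its character, the comparison needs no lookup table. The cost is that every new operator has to be added both to the enumeration and to this list.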
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
For Assert to print a readable message when a comparison fails, the framework needs a ToString function for the compared type.
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
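The test now compiles and fails. It can be made to pass by splitting the execution path, the (unconditional → if) transformation.
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}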
The next test covers a single digit.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
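This test does not compile yet: Token is still a typedef for Operator and cannot hold a number. Alternatives such as a class hierarchy under Token queried with dynamic_cast, std::function, or Boost.Any would each bring more machinery than this task needs. A simpler route is taken here: Token becomes a small class that records its own type. It is developed test-first as well, starting with the type of an operator token.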
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
Assert::AreEqual also needs a ToString function for TokenType. The next test checks the type of a number token.
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The constant returned by Type is replaced with a member variable, the (constant → scalar) transformation.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, the operator stored in an operator token must be retrievable.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
A conversion operator and a field to store the value are enough.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same goes for the value of a number token.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
A token holds either an operator or a number, never both, so the two fields can share storage in an anonymous union. The conversion operators check the stored type before returning a value.
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
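For comparison, with a C++17 compiler the tag and the union can be folded into std::variant. This is a sketch of an alternative, not the article's implementation: VariantToken is a hypothetical name, the Operator enumeration is cut down to two values to keep the example short, and the article's toolchain (Visual Studio 2013) does not provide std::variant.

```cpp
#include <stdexcept>
#include <variant>

// Assumption: C++17. The enumerations mirror the article's names, but
// VariantToken is a hypothetical alternative to the hand-written Token.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

class VariantToken {
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    // The active alternative of the variant doubles as the type tag.
    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    operator Operator() const {
        if(auto op = std::get_if<Operator>(&m_value)) return *op;
        throw std::logic_error("Should be operator token.");
    }

    operator double() const {
        if(auto num = std::get_if<double>(&m_value)) return *num;
        throw std::logic_error("Should be number token.");
    }

private:
    std::variant<Operator, double> m_value;
};
```

The behavior matches the union-based class for the cases tested above: the type is reported correctly, and converting to the wrong kind throws std::logic_error.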
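With Token in place, it is time to return to the lexer. The single-digit test passes with one more branch:
if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };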
The next test covers a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
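The C standard library already has functions for this, such as isdigit and atof, along with counterparts that work on wchar_t. Replacing the hand-written expressions with calls to them is the (expression → function) transformation.
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    // _wtof is the Microsoft-specific wide-character version of atof.
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}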
The next test combines an operator and a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
The function has to return more than one token. Before changing behavior, refactor: collect the tokens in a local variable result and return it at the end.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
Now apply the next transformation: (if → while). The condition becomes a loop that runs until the end of the string.
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the parsed number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
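wcstod replaces _wtof here because, besides converting the number, it reports through its second argument where the number ends. Scanning resumes from that position, so there is no need to work out the length of the number by hand.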
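The role of the end pointer is easy to see in isolation. In the sketch below, ScanNumbers is a hypothetical helper written only for this illustration: it collects every number in a string and skips everything else, relying on std::wcstod to say where each number stops.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper for illustration: collects every number found in `expr`
// and skips all other characters. Only the end-pointer handling matters here.
inline std::vector<double> ScanNumbers(const std::wstring &expr) {
    std::vector<double> numbers;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(static_cast<std::wint_t>(*current))) {
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(current, &end));
            current = end; // resume right after the parsed number
        } else {
            ++current;
        }
    }
    return numbers;
}
```

For the input L"12.34+5" the first call to std::wcstod stops at the plus sign, the loop skips it, and the second call parses the 5.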
The next test checks that spaces are skipped.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
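The (unconditional → if) transformation adds a branch that recognizes the operator explicitly, so that any other character is simply skipped.
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current; // not a number or an operator: skip it
    }
}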
The tests pass, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
Only the remaining operators and the parentheses are left. One test with a complex expression covers them all.
TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
Operator gains the missing enumerators.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The test still fails, because IsOperator in Tokenizer recognizes only the plus sign. It now has to check the current character against every operator.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {
        return *m_current == static_cast<wchar_t>(o);
    });
}
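Because every enumerator's value is the character itself, the same membership check can also be written against a plain string of operator characters. The sketch below is an alternative under that assumption, not the article's code; IsOperatorChar is a hypothetical free function.

```cpp
#include <string>

// Hypothetical free function: the set of operator characters lives in one
// string, so supporting a new operator means adding one character here.
inline bool IsOperatorChar(wchar_t ch) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(ch) != std::wstring::npos;
}
```

The trade-off is that the character list has to be kept in sync with the Operator enumeration by hand, whereas the std::any_of version names the enumerators directly.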
All tests pass. The complete code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
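The test does not compile yet, because neither the lexer nor its types exist. The lexer will return its result as a std::vector, aliased as Tokens. The simplest code that compiles is a stub that throws, so the test goes red.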
#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter
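The test now compiles and fails, so the next step is to make it pass with the smallest possible change. To choose that change I follow The Transformation Priority Premise (TPP), proposed by Robert C. Martin. Transformations are the counterpart of refactorings: a refactoring changes the structure of the code without changing its behavior, while a transformation changes the behavior, moving the code from the specific toward the generic. Transformations are ordered by complexity, and the premise is that a simpler transformation should be preferred to a more complex one when making a test pass. Likewise, the next test should be chosen so that it can be passed with the simplest transformation available. Following TPP tends to keep the steps small.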
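The transformations, from simplest to most complex:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) Replace nil with a constant.
- (constant → constant+) Replace a simple constant with a more complex one.
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements (break, continue, return and the like).
- (unconditional → if) Split the execution path.
- (scalar → array) Replace a variable/argument with an array.
- (array → container) Replace an array with a container.
- (statement → recursion) Replace a statement with recursion.
- (if → while) Replace a condition with a loop.
- (expression → function) Replace an expression with a function or algorithm.
- (variable → assignment) Replace the value of a variable.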
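To make the list more concrete, here is a minimal sketch of three of these transformations applied to a toy function. The function CountPluses and its three versions are hypothetical and are not part of the interpreter; each version only illustrates the smallest change that would pass one more test.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example: each version passes one more test than the previous
// one, using the simplest transformation that works.

// (nil -> constant): the first test ("" gives 0) only needs a fixed answer.
inline std::size_t CountPlusesV1(const std::wstring &) { return 0; }

// (unconditional -> if): a second test ("+" gives 1) splits the path.
inline std::size_t CountPlusesV2(const std::wstring &expr) {
    if(expr.empty()) return 0;
    return 1;
}

// (if -> while): a third test ("++" gives 2) replaces the single check
// with a loop over all characters.
inline std::size_t CountPlusesV3(const std::wstring &expr) {
    std::size_t count = 0;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(*current == L'+') ++count;
        ++current;
    }
    return count;
}
```

The lexer below grows in the same way: a constant result first, then a branch, then a loop.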
The first transformation, ({} → nil), is enough to pass the test.
inline Tokens Tokenize(std::string expr) { return{}; }
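The test passes, and there is nothing to refactor yet. Before writing the next test I replace std::string with std::wstring, so that expressions containing Unicode characters can be handled. The next test checks that a single plus sign becomes an operator token.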
TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
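AssertRange is a small helper namespace of my own. Its AreEqual function compares an expected initializer list with the actual range element by element and reports either a size difference or the position of the first mismatch.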
AssertRange
namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
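For readers who are not using CppUnitTestFramework, the same idea can be sketched with the standard library alone. The helper below is hypothetical (the name FirstMismatch and the -1 convention are my assumptions); it returns the position of the first difference instead of raising an assertion.

```cpp
#include <algorithm>
#include <cstddef>
#include <initializer_list>
#include <iterator>

// Hypothetical helper: returns the index of the first difference between
// `expect` and `actual`, or -1 when both have the same size and content.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    std::ptrdiff_t expectSize = std::distance(expect.begin(), expect.end());
    std::ptrdiff_t actualSize = std::distance(std::begin(actual), std::end(actual));
    std::ptrdiff_t common = std::min(expectSize, actualSize);

    auto expectIter = expect.begin();
    auto actualIter = std::begin(actual);
    for(std::ptrdiff_t i = 0; i < common; ++i, ++expectIter, ++actualIter) {
        if(!(*expectIter == *actualIter)) return i;
    }
    // All shared positions match; a size difference is reported at `common`.
    return expectSize == actualSize ? -1 : common;
}
```

Returning an index keeps the helper independent of any test framework; a framework-specific wrapper can turn the index into a failure message.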
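Operator is an enumeration. Its underlying type is wchar_t and each enumerator has the value of its symbol, so a character can be converted to an operator with a plain cast. For now, Token is simply an alias of Operator.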
enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;
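To print values in failure messages, Assert needs a ToString function for every type it compares, so one has to be provided for Token.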
std::wstring ToString(const Token &)
inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
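Now everything compiles and the new test fails. To pass it without breaking the first one, apply the (unconditional → if) transformation.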
inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
Both tests pass. Next, a single digit should be recognized as a number.
TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
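The test does not compile: Token is an alias of Operator and cannot hold a number. A token now has to be either an operator or a number, and there are several ways to model that: a Token class hierarchy queried with dynamic_cast, a design built around std::function, or a type-erased holder such as Boost.Any, which would bring in a dependency outside the standard library. For this project the simplest option is a single Token class that stores its type next to its value. Since Token becomes a class with behavior of its own, it gets its own tests.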
The first of them checks the type of an operator token.
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
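A ToString overload for TokenType is needed as well, because Assert now compares TokenType values. The next test covers number tokens.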
TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}
The test fails. The (constant → scalar) transformation fixes it: the hard-coded type becomes a field.
class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};
Next, a token should give back the operator it was constructed with.
TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}
A conversion operator that returns the stored value is enough.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};
The same is needed for numbers.
TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}
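A token holds either an operator or a number, never both, so the two fields can share storage in an anonymous union. The conversion operators also check the token type and throw if it does not match.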
Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
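As an aside, with a C++17 compiler the tag plus union pair could be replaced by std::variant. This is not what the article does (Visual Studio 2013 does not provide std::variant); the class name VariantToken and the reduced Operator enum below are assumptions made only for this sketch.

```cpp
#include <stdexcept>
#include <variant>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Sketch only: std::variant tracks which alternative is stored, so no
// separate m_type field is needed.
class VariantToken {
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    operator Operator() const {
        if(auto op = std::get_if<Operator>(&m_value)) return *op;
        throw std::logic_error("Should be operator token.");
    }

    operator double() const {
        if(auto num = std::get_if<double>(&m_value)) return *num;
        throw std::logic_error("Should be number token.");
    }

    // Equal only if both hold the same alternative with the same value.
    friend bool operator==(const VariantToken &left, const VariantToken &right) {
        return left.m_value == right.m_value;
    }

private:
    std::variant<Operator, double> m_value;
};
```

The public interface stays the same as in the union version, so the Token tests would apply to it unchanged apart from the class name.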
With Token in place, we can return to the lexer test for a single digit. The simplest way to pass it:
if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };
It is crude, but it works. The next test requires a floating-point number.
TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}
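Parsing the number by hand is not worth the effort. The C library already provides isdigit to recognize a digit and atof to convert a string to a number, and both have wchar_t counterparts: iswdigit and the Microsoft-specific _wtof. Replacing the manual arithmetic with these calls is the (expression → function) transformation.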
inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
The test passes. Next comes an operator followed by a number.
TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
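To return more than one token, the function first has to collect tokens somewhere. As a preparatory step, introduce a result variable and push tokens into it. The behavior does not change, and the new test still fails.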
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}
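Now the transformation itself: (if → while). The loop has to advance past whatever it has just read, otherwise it would never terminate.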
inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
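wcstod replaces _wtof here because, in addition to the converted value, it reports where the number ends. That gives the position from which scanning continues.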
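The end-pointer behavior of wcstod can be seen in isolation in the following sketch. ScanNumbers is a hypothetical helper, not part of the lexer; it collects every number in a string and steps over any character where nothing could be parsed.

```cpp
#include <cwchar>
#include <vector>

// Hypothetical helper: collects every number found in `text`, skipping any
// character that does not start one.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    std::vector<double> numbers;
    const wchar_t *current = text;
    while(*current) {
        wchar_t *end = nullptr;
        double value = std::wcstod(current, &end);
        if(end == current) {
            ++current;      // nothing parsed here, step over one character
        }
        else {
            numbers.push_back(value);
            current = end;  // continue right after the parsed number
        }
    }
    return numbers;
}
```

Note that wcstod on its own also accepts a leading sign, leading whitespace and forms such as hexadecimal literals. The lexer in the article calls it only after iswdigit has confirmed that the current character is a digit, so a sign is always tokenized as an operator first.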
All tests pass. Next, spaces should be ignored.
TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
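The (unconditional → if) transformation is enough: only a plus sign is treated as an operator, and any other character is skipped.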
while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
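The tests pass, so it is time to refactor. The scanning logic moves into a class in the Detail namespace, and Tokenize becomes a thin wrapper around it.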
inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext(); // skip anything that is neither a number nor an operator
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
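The code now reads well, and the remaining operators are easy to add. One test with a more complex expression covers them all.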
TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}
Operator gets the missing values.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The test compiles but fails. The IsOperator method of Tokenizer has to recognize the new symbols.
bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
        Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
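An alternative, shown here only as a sketch, is to look the character up in a string of operator symbols. The free function IsOperatorChar and the symbol string are my assumptions, not code from the article.

```cpp
#include <cwchar>

// Hypothetical alternative to IsOperator: look the character up in a string
// of known operator symbols.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also matches the terminating L'\0', so exclude it explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The lookup is shorter, but the list of symbols is then kept in a second place besides the Operator enum, which is a reason to prefer the any_of version above.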
All tests pass, and the lexer is done. The complete code at this point:
Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
            Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
. .
TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }
, . , . . result
, .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }
: (if → while). , .
inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }
wcstod
, _wtof
, . , . , .
. .
.
TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }
(unconditional → if) , .
while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
. . Detail
. Tokenize
.
inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }
Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail
, . , , . .
TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }
Operator
, .
enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };
. IsOperator
Tokenizer
.
bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
. .
Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter
InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }
GitHub . , . "__".
. , .
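The result of tokenization will be a std::vector, aliased as Tokens. To get the first test to compile, a stub that throws is enough.
    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: compiles, but any test that calls it fails (red).
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter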
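To decide how little code is enough to make a failing test pass, I follow The Transformation Priority Premise (TPP), proposed by Robert C. Martin. Its idea is that every change that turns a test green is a transformation that makes the code more general, and that these transformations can be ordered from simple to complex. When several transformations would make the test pass, the simpler one is preferred. When the next test demands a complex one, that is a hint to pick a different test first.
The transformations, in order of priority:
- ({} → nil) No code at all becomes code that employs nil.
- (nil → constant)
- (constant → constant+) A simple constant becomes a more complex one.
- (constant → scalar) A constant becomes a variable or an argument.
- (statement → statements) More unconditional statements are added (break, continue, return and the like).
- (unconditional → if) The execution path is split.
- (scalar → array)
- (array → container)
- (statement → recursion)
- (if → while)
- (expression → function) An expression is replaced with a function or an algorithm.
- (variable → assignment) The value of a variable is replaced.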
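As a rough illustration of how the first few transformations play out, here is a minimal sketch on a toy function. Sum and the namespaces Step1, Step2 and Step3 are hypothetical and not part of the interpreter; each step is meant to be the smallest change that a new test would force.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy example: summing a list of ints, grown test by test.

namespace Step1 {
// (nil -> constant): enough for the test "an empty list sums to 0".
inline int Sum(const std::vector<int> &) { return 0; }
}

namespace Step2 {
// (unconditional -> if): the test "a single element sums to itself"
// splits the execution path.
inline int Sum(const std::vector<int> &v) {
    if(v.empty()) return 0;
    return v[0];
}
}

namespace Step3 {
// (if -> while): the test "several elements" turns the branch into a loop.
inline int Sum(const std::vector<int> &v) {
    int total = 0;
    std::size_t i = 0;
    while(i < v.size()) {
        total += v[i];
        ++i;
    }
    return total;
}
}

// Each version passes the tests written up to that step.
inline void CheckSteps() {
    assert(Step1::Sum({}) == 0);
    assert(Step2::Sum({}) == 0);
    assert(Step2::Sum({ 7 }) == 7);
    assert(Step3::Sum({}) == 0);
    assert(Step3::Sum({ 7 }) == 7);
    assert(Step3::Sum({ 1, 2, 3 }) == 6);
}
```

Tokenize goes through the same sequence in this section: a constant result, then a branch, then a loop.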
The first transformation is enough to make the test pass.
    inline Tokens Tokenize(std::string expr) { return {}; }
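The test is green. Before the next one, std::string is replaced with std::wstring so that expressions can contain Unicode. The next test checks a single plus operator.
    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }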
AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with an actual range, first by size and then element by element, and reports the position of the first mismatch.
    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // Compare sizes first, so a missing or extra element is reported clearly.
        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
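Operator is an enumeration whose underlying type is wchar_t, so each value is simply the character of the operator. For now, a token is just an operator.
    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;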
For Assert to print the compared values in a failure message, a ToString function has to be defined for their type.
    inline std::wstring ToString(const Token &token) {
        return { static_cast<wchar_t>(token) };
    }
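Now everything compiles, but the test is red, because Tokenize still returns an empty list. The (unconditional → if) transformation fixes that.
    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return {};
        }
        return { static_cast<Operator>(expr[0]) };
    }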
The next test covers a single digit.
    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
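This test forces a design decision: a token must now be able to hold either an operator or a number. Several approaches are possible, among them a class hierarchy rooted at Token with dynamic_cast to recover the concrete type, a solution built on std::function, and Boost.Any. The one used below is a single Token class that reports its own type.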
Token gets its own tests. The first one checks that an operator token reports its type.
    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };
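As with Operator, a ToString function is needed for TokenType before the assertion compiles. The next test does the same check for a number token.
    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }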
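The test needs a constructor that takes a double. The (constant → scalar) transformation then replaces the constant return value with a field.
    class Token {
    public:
        Token(Operator) : m_type(TokenType::Operator) {}
        Token(double) : m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };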
Next, the operator itself has to be retrievable from the token.
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
The token stores the operator and returns it through a conversion operator.
    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same for a number.
    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the type first and throw if it does not match.
Token
    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        // Throws if the token does not hold an operator.
        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        // Throws if the token does not hold a number.
        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: the function returns std::wstring.
                return L"Unknown token.";
        }
    }
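To try the tag-plus-union idea outside Visual Studio, a reduced sketch can help. TaggedValue, Kind, AsOperator, AsNumber and ThrowsOnWrongAccess are hypothetical names chosen for this illustration, and named accessors stand in for the article's conversion operators; it is not the article's Token.

```cpp
#include <cassert>
#include <stdexcept>

enum class Kind { Operator, Number };

// Hypothetical stand-in for Token, reduced to what the union needs.
class TaggedValue {
public:
    explicit TaggedValue(wchar_t op) : m_kind(Kind::Operator), m_op(op) {}
    explicit TaggedValue(double num) : m_kind(Kind::Number), m_num(num) {}

    Kind Type() const { return m_kind; }

    wchar_t AsOperator() const {
        if(m_kind != Kind::Operator) throw std::logic_error("Should be operator.");
        return m_op;
    }

    double AsNumber() const {
        if(m_kind != Kind::Number) throw std::logic_error("Should be number.");
        return m_num;
    }

private:
    Kind m_kind;   // tag: tells which union member is active
    union {        // anonymous union: both members share the same storage
        wchar_t m_op;
        double m_num;
    };
};

// Returns true if reading a number as an operator is rejected.
inline bool ThrowsOnWrongAccess() {
    try {
        TaggedValue(1.5).AsOperator();
    } catch(const std::logic_error &) {
        return true;
    }
    return false;
}

inline void CheckTaggedValue() {
    assert(TaggedValue(L'+').Type() == Kind::Operator);
    assert(TaggedValue(L'+').AsOperator() == L'+');
    assert(TaggedValue(2.5).AsNumber() == 2.5);
    assert(ThrowsOnWrongAccess());
}
```

The tag check is what makes the union safe to read: only the member that was written by the constructor is ever returned.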
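Back to the lexer. With Token in place, the single-digit test passes after one more branch is added to Tokenize:
    if(expr[0] >= L'0' && expr[0] <= L'9') {
        // Convert the digit character to its numeric value.
        return { static_cast<double>(expr[0] - L'0') };
    }
    return { static_cast<Operator>(expr[0]) };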
The next test is a floating-point number.
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
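The C standard library helps here: isdigit recognizes a digit and atof converts a string to a number, or rather their wchar_t counterparts, iswdigit and _wtof, do. This is the (expression → function) transformation.
    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return {};
        // _wtof is the Microsoft-specific wide-character version of atof.
        if(iswdigit(*current)) return { _wtof(current) };
        return { static_cast<Operator>(*current) };
    }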
Green. The next test combines an operator and a number.
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
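To return more than one token, they have to be accumulated somewhere. As a preparatory step, a result variable is introduced. The new test stays red for now, while the earlier ones remain green.
    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        } else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }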
Now the transformation itself: (if → while). The loop runs until the end of the string is reached.
    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;   // continue right after the number
            } else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
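wcstod replaces _wtof because, besides converting, it reports where the number ends, so scanning can continue from that position.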
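The role of the end pointer is easiest to see in isolation. The sketch below assumes a hypothetical helper, NumberLength, which is not part of the article's code; it only wraps the standard std::wcstod call and returns how many characters the number occupied.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper: parses the number at the start of text, stores it in
// value and returns the number of characters that were consumed.
inline std::size_t NumberLength(const wchar_t *text, double &value) {
    wchar_t *end = nullptr;
    value = std::wcstod(text, &end);   // end is set to the first unparsed character
    return static_cast<std::size_t>(end - text);
}

inline void CheckNumberLength() {
    double value = 0.0;

    // Parsing stops at the '*', so the caller knows where to resume.
    assert(NumberLength(L"2.5*3", value) == 3);
    assert(value == 2.5);

    // No digits at all: nothing is consumed and end stays at the start.
    assert(NumberLength(L"*", value) == 0);
}
```

Note that wcstod accepts more than plain decimals, for example exponents such as 1e5, so a lexer built on it accepts those forms as well.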
The next test: spaces should be skipped.
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
(unconditional → if) once more: a character that is neither a digit nor a plus is simply skipped.
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        } else {
            ++current;   // skip anything that is not part of a token
        }
    }
Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.
    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }
Detail::Tokenizer
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
What remains is to recognize the other operators and the parentheses. One test with a more complex expression covers them all.
    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
Operator gets the missing values.
    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
IsOperator in Tokenizer now checks the current character against every known operator.
    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                     Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
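For comparison, the same membership check can be written with the standard std::wcschr function. This is only an alternative sketch, not the article's approach: IsOperatorChar is a hypothetical free function, and the character list is repeated by hand instead of being taken from the Operator enumeration.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also finds the terminating L'\0' of the list,
    // so the null character has to be excluded explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}

inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'+'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
    assert(!IsOperatorChar(L'\0'));
}
```

The version with std::any_of keeps the enumeration as the single source of operator values, while this one is shorter at the cost of duplicating the list.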
All tests are green. The complete code at this point:
Interpreter.h
    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return { static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                         Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp
    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
GitHub . , . "__".
. , .
std::vector, Tokens
. .
#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter
, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .
:
({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .
.
inline Tokens Tokenize(std::string expr) { return{}; }
, . .
. . . . . .
, std::string
std::wstring
. , Unicode. .
TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }
AssertRange
- , AreEqual
, , , .
AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
Operator
. wchar_t
, , .
enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
, , Assert
, ToString
, .
std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }
, , . , (unconditional → if).
inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }
. . . …
.
TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }
. , , - . , :
, Token
. dynamic_cast
, . , , . std::function
. . Boost.Any, - . .
, . - . , , .
… . . . .
.
enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
ToString
TokenType
, , . .
TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }
. (constant → scalar) .
class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };
… . . . .
.
TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }
.
class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };
.
TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }
, , union
. .
Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }
, , , . . .
. . . .
:
if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };
, .
… . .
TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }
, C
, isdigit
, , atof
, , wchar_t
. (expression → function). .
inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }
The next test: an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}
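Passing this test needs more than one token in the output. Before changing the behaviour, we refactor: the tokens are now collected in a result variable, and the earlier tests still pass.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}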
Now the transformation itself: (if → while). The check for an empty string becomes the loop condition.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the parsed number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}
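Note that wcstod has replaced _wtof. Unlike _wtof, it also reports where the parsed number ends, so the loop can continue scanning from that position.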
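The role of the end pointer is easier to see in isolation. The following is a minimal sketch using only the standard library; ScanNumber here is a hypothetical free function written for this illustration, not the member function that appears later in the article.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper: parses a number at 'current' and moves 'current'
// just past the last character that std::wcstod consumed.
inline double ScanNumber(const wchar_t *&current) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(current, &end);
    current = end;
    return value;
}

// Usage: scan "12.5+3" piece by piece.
inline void CheckScanNumber() {
    const wchar_t *expr = L"12.5+3";
    const wchar_t *current = expr;

    assert(ScanNumber(current) == 12.5); // 12.5 is exactly representable
    assert(*current == L'+');            // scanning stopped at the operator
    ++current;                           // step over '+'
    assert(ScanNumber(current) == 3.0);
    assert(*current == L'\0');           // reached the end of the expression
}
```

One thing to keep in mind is that wcstod accepts more than plain decimals, for example an exponent as in 1e3, so the lexer inherits that syntax whether or not it was intended.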
The next test: spaces in the expression should be skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}
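The (unconditional → if) transformation: a character becomes an operator token only if it is a known operator, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current; // neither a digit nor an operator: skip it
    }
}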
The function has grown enough to deserve a refactoring. The scanning logic moves into a class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail
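One property of the loop above is worth noting: its final else branch skips every unrecognized character, not only spaces, so an input such as "1 $ 2" produces the same tokens as "1 2". Whether that is acceptable is a design decision. The sketch below shows one possible stricter variant that skips only whitespace and reports anything else. It is an illustration, not the article's code: SimpleToken and StrictTokenize are hypothetical names, and a plain struct stands in for the Token class to keep the example short.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-in for the article's Token, for illustration only.
struct SimpleToken {
    bool isNumber;
    double number; // meaningful when isNumber is true
    wchar_t op;    // meaningful when isNumber is false
};

inline std::vector<SimpleToken> StrictTokenize(const std::wstring &expr) {
    static const std::wstring operators = L"+-*/()";
    std::vector<SimpleToken> result;
    const wchar_t *current = expr.c_str();
    while(*current != L'\0') {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            const double value = std::wcstod(current, &end);
            result.push_back(SimpleToken{ true, value, L'\0' });
            current = end;
        } else if(operators.find(*current) != std::wstring::npos) {
            result.push_back(SimpleToken{ false, 0.0, *current });
            ++current;
        } else if(std::iswspace(*current)) {
            ++current; // whitespace is the only thing skipped silently
        } else {
            throw std::invalid_argument("Unexpected character in expression.");
        }
    }
    return result;
}

// Usage: spaces are skipped, a stray character is rejected.
inline void CheckStrictTokenize() {
    const auto tokens = StrictTokenize(L" 1 + 12.5 ");
    assert(tokens.size() == 3);
    assert(tokens[0].isNumber && tokens[0].number == 1.0);
    assert(!tokens[1].isNumber && tokens[1].op == L'+');
    assert(tokens[2].isNumber && tokens[2].number == 12.5);

    bool thrown = false;
    try { StrictTokenize(L"1 $ 2"); }
    catch(const std::invalid_argument &) { thrown = true; }
    assert(thrown);
}
```

In a TDD workflow this behaviour would of course be driven by its own test first, for example one that expects an exception for an unknown character.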
One more test remains for the lexer: a complete expression with all the operators and parentheses.

TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
        Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
        Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}
The Operator enumeration gets the values that were missing.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};
The IsOperator method of Tokenizer now has to recognise all of them.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
All tests pass. The complete code at this point follows.

Interpreter.h

#pragma once
#include <algorithm>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>  // wcstod
#include <wctype.h> // iswdigit

namespace Interpreter {

// The underlying value of each operator is its character.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

// A token is either an operator or a number; m_type says which.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter
InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

// Compares an expected list with an actual range, element by element.
template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual);
        ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position "
            + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
The code for the article is available on GitHub.
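The test does not compile yet, so we add the minimum needed to build it. The Tokenize function returns a std::vector of tokens, aliased as Tokens. For now it throws, so the test fails.

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception(); // not implemented yet: the test must fail first
}

} // namespace Lexer
} // namespace Interpreter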
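Now the test has to be made to pass, and the question is how much code to write at each step. Robert Martin's The Transformation Priority Premise (TPP) is a useful guideline here. Where a refactoring changes the structure of the code without changing its behaviour, a transformation changes the behaviour and makes the code more general. Transformations are ordered from simple to complex, and the premise is that a simpler transformation should be preferred over a more complex one when making a test pass. The rest of the article refers to the TPP transformations by name.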
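The transformations, in order of priority:

- ({} → nil): from no code at all to code that returns nil.
- (nil → constant): from nil to a constant.
- (constant → constant+): from a simple constant to a more complex one.
- (constant → scalar): replacing a constant with a variable or an argument.
- (statement → statements): adding more unconditional statements (break, continue, return and the like).
- (unconditional → if): splitting the execution path.
- (scalar → array): replacing a variable or argument with an array.
- (array → container): replacing an array with a more complex container.
- (statement → recursion): replacing a statement with recursion.
- (if → while): replacing a condition with a loop.
- (expression → function): replacing an expression with a function or algorithm.
- (variable → assignment): replacing the value of a variable.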
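To make the list less abstract, here is a small sketch that is not part of the interpreter: a hypothetical CountDigits function grown through a few of the transformations, with each stage kept as a separate function so they can be compared side by side. The function names and the choice of example are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Stage 1, (nil -> constant): enough for a test that passes an empty string.
inline int CountDigits_Constant(const std::wstring &) { return 0; }

// Stage 2, (unconditional -> if): a test with a single digit forces a branch.
inline int CountDigits_If(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}

// Stage 3, (if -> while): a test with several digits turns the branch into a
// loop; the constant result also becomes a variable (constant -> scalar).
inline int CountDigits_While(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(std::iswdigit(text[i])) ++count;
        ++i;
    }
    return count;
}

// Each stage passes the tests written up to that point and nothing more.
inline void CheckStages() {
    assert(CountDigits_Constant(L"") == 0);
    assert(CountDigits_If(L"7") == 1);
    assert(CountDigits_If(L"+") == 0);
    assert(CountDigits_While(L"1+23") == 3);
}
```

The point of the ordering is that stage 3 is only written once a test demands it; jumping straight to the loop would skip the cheaper transformations.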
The simplest change makes the first test pass: return an empty list.

inline Tokens Tokenize(std::string expr) {
    return{};
}
The next test is for a single plus operator. Note that from here on std::string gives way to std::wstring, so that the lexer works with Unicode input.

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}
AssertRange is not part of the framework. It is a small helper of our own whose AreEqual compares an expected list with the actual range: first the sizes, then the elements one by one, reporting the position of a mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual);
        ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position "
            + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
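The same comparison can be tried outside Visual Studio. The sketch below is a framework-free approximation of the helper, assuming nothing beyond the standard library; FirstMismatch is a hypothetical name, and instead of failing a test it returns the index of the first difference.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first difference between the
// expected list and the actual range, or -1 when they match completely.
// A length difference is reported at the index where the shorter one ends.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    const bool sameLength =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameLength ? -1 : index;
}

// Usage: equal ranges, a differing element, and a shorter expected list.
inline void CheckFirstMismatch() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1);
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);
    assert(FirstMismatch({ 1, 2 }, actual) == 2);
}
```

Taking the expected values as an initializer_list is what allows the braces at the call site, the same convenience the article's tests rely on.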
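Operator is an enumeration with wchar_t as its underlying type, so the value of each operator is simply its character, and a character from the expression can be cast to it directly. For now a token is nothing more than an operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;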
For Assert to print a readable message when a comparison fails, a ToString function is needed for the compared type.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}
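Choosing wchar_t as the underlying type means that converting between a character and an operator is a plain static_cast in both directions. A minimal sketch of that round trip, using only the standard library; OperatorToString and CheckOperatorRoundTrip are hypothetical names used to keep the example apart from the project's own ToString.

```cpp
#include <cassert>
#include <string>

// Mirrors the enumeration at this stage: the value is the character itself.
enum class Operator : wchar_t { Plus = L'+' };

// Same idea as ToString above: a one-character wide string.
inline std::wstring OperatorToString(Operator op) {
    return std::wstring(1, static_cast<wchar_t>(op));
}

// Usage: character -> Operator -> text.
inline void CheckOperatorRoundTrip() {
    const std::wstring expr = L"+";
    const Operator op = static_cast<Operator>(expr[0]); // character -> Operator
    assert(op == Operator::Plus);
    assert(OperatorToString(op) == L"+");               // Operator -> text
}
```

The flip side is that the cast accepts any character: casting L'?' compiles and yields a value that matches no enumerator, so the lexer has to check that a character is a known operator before converting it.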
To make the new test pass without breaking the first one, we apply (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}
The next test: a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}
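This test does not even compile: a token now has to carry either an operator or a number. Several designs come to mind:

- A hierarchy of classes derived from Token, which requires dynamic_cast to get at the concrete value.
- A solution built on std::function.
- A generic holder such as Boost.Any.

We settle on a single Token class that knows its own type and returns the value that matches it. Since this is a new class, it gets its own tests.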
We start with the type of an operator token.

enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};
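A ToString function for TokenType is needed as well, for the same reason as before. The next test checks the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}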
GitHub . , . "__".
. , .
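The lexer returns its result as a std::vector of tokens, aliased as Tokens. A stub that throws is enough to make the first test compile and fail.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception(); // Red: fail until there is a real implementation.
    }

    } // namespace Lexer
    } // namespace Interpreter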
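To decide how to make a failing test pass, we follow the Transformation Priority Premise (TPP). A refactoring changes the structure of the code without changing its behavior. A transformation changes the behavior, moving the code from a specific solution toward a more general one. Transformations are ordered from simple to complex, and at each step the simplest one that makes the test pass is preferred. The same ordering helps when picking the next test: choose the one that can be satisfied with the simplest transformation.
The transformations, in priority order:
- ({} → nil) From no code at all to code that returns nil.
- (nil → constant) Replace nil with a constant.
- (constant → constant+) Replace a simple constant with a more complex one.
- (constant → scalar) Replace a constant with a variable or an argument.
- (statement → statements) Add more unconditional statements (break, continue, return and the like).
- (unconditional → if) Split the execution path.
- (scalar → array) Replace a scalar with an array.
- (array → container) Replace an array with a more complex container.
- (statement → recursion) Replace a statement with recursion.
- (if → while) Replace a condition with a loop.
- (expression → function) Replace an expression with a function or algorithm.
- (variable → assignment) Change the value of a variable.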
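To make the ordering concrete, here is a minimal sketch of one function growing through three transformations. The function Length and the namespaces Step1 to Step3 are hypothetical and are not part of the interpreter; they only illustrate how each new test forces the next, more general form.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: three stages of one function, each produced by a
// single transformation from the list above.
namespace Step1 {
// (nil → constant): the first test only needs a fixed answer.
inline std::size_t Length(const wchar_t *) { return 0; }
}

namespace Step2 {
// (unconditional → if): a second test forces the execution path to split.
inline std::size_t Length(const wchar_t *text) {
    if(*text != L'\0') return 1;
    return 0;
}
}

namespace Step3 {
// (if → while): a third test turns the condition into a loop.
inline std::size_t Length(const wchar_t *text) {
    std::size_t length = 0;
    while(*text != L'\0') {
        ++length;
        ++text;
    }
    return length;
}
}

// Each stage passes the tests that existed when it was written.
inline void CheckSteps() {
    assert(Step1::Length(L"") == 0);
    assert(Step2::Length(L"") == 0);
    assert(Step2::Length(L"+") == 1);
    assert(Step3::Length(L"1+2") == 3);
}
```

The lexer below follows the same path: a constant result first, then an if, then a while.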
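Returning an empty list is enough to make the first test pass.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }

The next test tokenizes a single operator. It passes a wide string literal, so the parameter type changes from std::string to std::wstring, which also lets the lexer accept Unicode input.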
    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

AssertRange is a small helper of our own. Its AreEqual function compares a range with an expected initializer list: it checks the sizes first, then compares the elements one by one and reports the position of the first mismatch.

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // Compare the sizes first so that a length mismatch is reported clearly.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
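For readers without Visual Studio at hand, the same comparison can be sketched without the test framework. FirstMismatch is a hypothetical name, not something from the article or from CppUnitTestFramework. Instead of asserting, it returns the position of the first difference, or -1 when the ranges are equal.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical framework-free counterpart of AssertRange::AreEqual.
// Returns -1 when both ranges hold equal elements, otherwise the index of the
// first difference. If one range is a prefix of the other, the index equals
// the length of the shorter range.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    const bool sameSize = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameSize ? -1 : position;
}

inline void CheckFirstMismatch() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1); // equal ranges
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);  // element differs
    assert(FirstMismatch({ 1, 2 }, actual) == 2);     // expected range is shorter
}
```

Returning a position rather than asserting keeps the helper independent of any particular framework; a test can then assert on the result however it likes.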
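Operator is an enum class with wchar_t as its underlying type. Each enumerator's value is the character itself, so a character converts to an operator with a plain cast. For now, Token is just an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;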
To build a readable failure message, Assert needs a ToString function for the compared type, so we add one.

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }
For both tests to pass, the execution path has to split, which is the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
The next test is for a number consisting of a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
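This test does not compile yet: a token now has to hold either an operator or a number. There are several ways to get there:
- Build a Token class hierarchy and recover the concrete type with dynamic_cast.
- Hide the difference behind std::function.
- Store the value in something like Boost.Any.
Here we use a single Token class that knows its own type. It gets its own test class, starting with the type of an operator token.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };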
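A ToString function for TokenType is needed as well, so that Assert can print it. The next test covers the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }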
The (constant → scalar) transformation replaces the hard-coded type with a member variable.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };
Next, an operator token has to give back the operator it was built from.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

A conversion operator does that.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };
The same goes for a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }

A token never holds an operator and a number at the same time, so both are stored in an anonymous union. Asking a token for the wrong kind of value throws.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                return L"Unknown token."; // Wide literal: the function returns std::wstring.
        }
    }
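As a point of comparison only, the tag-plus-union pair can also be expressed with std::variant. This is not what the article does: std::variant requires C++17, which the Visual Studio 2013 toolchain used here does not provide. VariantToken is a hypothetical name, and the enums are cut-down copies so that the sketch compiles on its own.

```cpp
#include <cassert>
#include <variant>

// Cut-down copies of the article's enums so the sketch is self-contained.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Hypothetical C++17 alternative to the article's Token: std::variant keeps
// track of which alternative is active, replacing m_type and the union.
class VariantToken {
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    // std::get throws std::bad_variant_access when the wrong kind is requested.
    operator Operator() const { return std::get<Operator>(m_value); }
    operator double() const { return std::get<double>(m_value); }

    friend bool operator==(const VariantToken &left, const VariantToken &right) {
        return left.m_value == right.m_value;
    }

private:
    std::variant<Operator, double> m_value;
};

inline void CheckVariantToken() {
    VariantToken number(1.23);
    VariantToken plus(Operator::Plus);
    assert(number.Type() == TokenType::Number);
    assert(plus.Type() == TokenType::Operator);
    assert(static_cast<double>(number) == 1.23);
    assert(static_cast<Operator>(plus) == Operator::Plus);
    assert(!(number == plus));
}
```

The trade-off is the exception type: the article's Token throws std::logic_error with a descriptive message, while this version surfaces std::bad_variant_access.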
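With Token in place, we can return to the lexer. The single-digit test passes once the path splits again:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };

This handles only one digit, so the next test uses a real floating-point number.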
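    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

Rather than parsing digits by hand, we can lean on the C library. isdigit recognizes a digit and atof converts the text. Since we work with wchar_t, we use their wide-character counterparts iswdigit and _wtof (the latter is Microsoft-specific). This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }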
Now for an expression with more than one token.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

The test fails because only the first token is returned. Before adding a loop, we prepare the code with a small refactoring: tokens are collected in a result variable, which is returned at the end.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }
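The next transformation is (if → while): the check for the end of the string becomes the loop condition.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // Continue right after the number.
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }

wcstod replaces _wtof because, besides converting the number, it reports through its second argument where the number ends. That is exactly what we need to move on to the next token.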
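The end-pointer behavior of wcstod can be checked in isolation. The sketch below wraps the call in a hypothetical ScanNumber helper (a free function, not the Tokenizer method of the same name that appears later) and assumes the default "C" locale, in which the decimal separator is a period.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper: the parsed value plus a pointer to the first character
// that is not part of the number.
struct ScannedNumber {
    double value;
    const wchar_t *rest;
};

inline ScannedNumber ScanNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    // When no conversion can be performed, wcstod returns 0 and sets end to text.
    return { value, end };
}

inline void CheckScanNumber() {
    const ScannedNumber scanned = ScanNumber(L"12.34+5");
    assert(scanned.value == 12.34);
    assert(*scanned.rest == L'+'); // scanning stopped at the operator

    const wchar_t *notANumber = L"+";
    assert(ScanNumber(notANumber).rest == notANumber); // nothing consumed
}
```

Note that wcstod also accepts leading whitespace and a sign. The lexer never hands it such input, because it calls wcstod only when the current character is a digit.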
Next, spaces in the expression have to be ignored.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

One more (unconditional → if) transformation: a character that is neither a digit nor an operator is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }
The function has grown, so it is time to refactor. The logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext(); // Skip spaces and any other unknown character.
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
The last test covers an expression with all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }

Operator gets the remaining operators and both parentheses.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };
Only the IsOperator method of Tokenizer has to change.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
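The same membership check can also be written against a string of operator characters. The sketch below is only an alternative for comparison; IsOperatorChar is a hypothetical free function, and its character set is assumed to mirror the Operator enum above, which means the two would have to be kept in sync by hand.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to Tokenizer::IsOperator: look the character up in
// a string holding every operator character from the Operator enum.
inline bool IsOperatorChar(wchar_t ch) {
    // wcschr also matches the terminating L'\0', so exclude it explicitly.
    return ch != L'\0' && std::wcschr(L"+-*/()", ch) != nullptr;
}

inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'+'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
    assert(!IsOperatorChar(L'\0'));
}
```

The article's std::any_of version derives the set from the enumerators themselves, so a renamed or re-valued enumerator cannot drift out of step with the check. That is a reasonable argument for keeping it.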
That completes the lexer. Here is the code in full.

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext(); // Skip spaces and any other unknown character.
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter
InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    } // namespace InterpreterTests
The source code for the article is available on GitHub.