Writing a Simple Interpreter in C++ with TDD, Part 1

Introduction





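Many C++ programmers have heard of test-driven development. However, almost all the material on the topic deals with higher-level languages and focuses on general theory rather than practice. So in this article I will walk through the step-by-step, test-driven development of a small C++ project: as the title suggests, a simple interpreter for mathematical expressions. Such a project also makes a good code kata, since it takes no more than an hour to complete (unless you are writing an article about the code at the same time).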



Architecture



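Even though the application architecture emerges gradually when you use TDD, some up-front design is still necessary. It can significantly reduce the total time spent on implementation. This is especially effective when there are ready-made examples of similar systems to use as a model. In this case there is a well-established view of how compilers and interpreters should be structured, and we can make use of it.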



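There are many libraries and tools that can ease the development of interpreters and compilers, from Boost.Spirit to ANTLR and Bison. You could even run a command-line interpreter through a popen pipe and evaluate expressions with it. However, the purpose of this article is the step-by-step development of a fairly complex system using TDD, so only the C++ standard library and the test framework built into the IDE will be used.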



First, let's list what a simple interpreter should be able to do, in descending order of priority.





This article implements only the first three items. The project itself will conceptually consist of four parts.





Tools



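The program will be written in Visual Studio 2013 with the Visual C++ Compiler Nov 2013 CTP installed. The tests will be based on CppUnitTestFramework, the test framework for C++ projects that is built into Visual Studio. It offers only minimal support for writing unit tests (compared with Boost.Test or CppUTest), but on the other hand it is well integrated into the development environment. An alternative is Eclipse with the C/C++ Unit plugin installed and Boost.Test, GTest, or QtTest configured. With that setup I recommend using clang, which provides several powerful compile-time and run-time analyzers; combined with TDD, they make the code far more resistant to errors.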



So, let's create a new project of type "Native Unit Test Project" and make sure that everything compiles.



Lexer



Let's start by developing the lexer. We will follow the usual TDD Red-Green-Refactor cycle:

  1. Write a test and watch it fail (Red).
  2. Make it pass (Green).
  3. Improve the design (Refactor).


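Let's write the first test and place it in the LexerTests class. I use a technique called a test list: I write down the tests I plan to write next. Ideas for future tests, which often come up while writing the current one and cannot be implemented right away, go into the same list.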





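I am used to writing test names in BDD style. Each test name starts with the word Should, and the subject is whatever the class name refers to. That is, Lexer ... should do A in response to B. This keeps each test focused on a small aspect of behavior and stops it from growing.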



    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize("");
            Assert::IsTrue(tokens.empty());
        }
    };
      
      





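In CppUnitTestFramework, the TEST_CLASS macro generates the class in which test methods are placed. The TEST_METHOD macro, in turn, creates the test method itself. Note that the class instance is created only once, before all of its tests are run. In Boost.Test, by contrast, a new class instance is created before each test. For that reason, code that must run before each test goes into a method declared with the TEST_METHOD_INITIALIZE macro, and code that must run after each test goes into a method declared with TEST_METHOD_CLEANUP. All assertion methods are static and live in the Assert class. There are not many of them, but they cover the basic functionality.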



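Let's get back to the test. It does not merely fail, it does not even compile. Let's create a Tokenize function that takes a string and returns a std::vector of tokens, aliased as Tokens.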



    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Deliberately fails for now: the test must be red first.
    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer
    } // namespace Interpreter





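Now the test compiles but fails. To choose the smallest change that will make it pass, I use The Transformation Priority Premise (TPP). Transformations are changes that alter the behavior of the code, as opposed to refactorings, which preserve it. TPP orders them from the simplest to the most complex and suggests preferring the simpler ones when making a test pass.

The transformations, in priority order:

  1. ({} → nil) no code at all becomes code that employs nil.
  2. (nil → constant).
  3. (constant → constant+) a simple constant becomes a more complex one.
  4. (constant → scalar) a constant is replaced with a variable or an argument.
  5. (statement → statements) more unconditional statements are added (break, continue, return, and the like).
  6. (unconditional → if) the execution path is split.
  7. (scalar → array).
  8. (array → container).
  9. (statement → recursion).
  10. (if → while).
  11. (expression → function) an expression is replaced with a function or algorithm.
  12. (variable → assignment) the value of a variable is replaced.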
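To make the list more concrete, here is a minimal sketch of how a few of these transformations might show up in successive versions of one function. The function names (SumV1, SumV2, SumV3) and the summing task itself are assumptions chosen purely for illustration; they are not part of the interpreter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example, not from the interpreter: three versions of a
// function that sums a vector, each one a transformation or two further on.

// (nil → constant): a test with empty input only needs a constant.
inline int SumV1(const std::vector<int> &) {
    return 0;
}

// (unconditional → if) and (constant → scalar): a test with one element
// forces a branch and a value taken from the argument.
inline int SumV2(const std::vector<int> &values) {
    if (values.empty()) {
        return 0;
    }
    return values[0];
}

// (if → while), with (variable → assignment): a test with several
// elements turns the branch into a loop that updates a running total.
inline int SumV3(const std::vector<int> &values) {
    int sum = 0;
    std::size_t i = 0;
    while (i < values.size()) {
        sum += values[i];
        ++i;
    }
    return sum;
}
```

Each version does just enough for the tests written so far; the later, more complex transformations are applied only when a new test demands them.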

For the first test, the simplest change is enough: return an empty list.



    inline Tokens Tokenize(std::string expr) {
        return{};
    }





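The test passes, and there is nothing to refactor yet. Before writing the next test, I replace std::string with std::wstring so that the lexer can work with Unicode. The next test checks a single operator.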



    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }





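AssertRange is a helper namespace whose AreEqual function compares an expected initializer list with the actual range, first by size and then element by element.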



AssertRange:

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
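For readers who use a different test framework, the same element-by-element comparison can be sketched with the standard library alone. The helper name FirstMismatch and its return convention are assumptions made for this example; the article's own code reports mismatches through Assert instead.

```cpp
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first position where the
// two ranges differ. If one range is a prefix of the other, it returns the
// length of the shorter one. Returns -1 when the ranges are equal.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;

    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter)) {
            return position;
        }
    }

    // One of the ranges ended early, so the sizes differ.
    if (expectIter != std::end(expect) || actualIter != std::end(actual)) {
        return position;
    }
    return -1;
}
```

A caller could turn the returned index into a failure message, much as AssertRange::AreEqual does with "Mismatch in position".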







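Operator is an enumeration. Its underlying type is wchar_t, so a character from the expression can be converted to an operator with a plain cast.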



    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;





For Assert to be able to print values of this type when a comparison fails, a ToString function is needed as well.



std::wstring ToString(const Token &):

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }







Now the test compiles but fails. To make it pass, apply the (unconditional → if) transformation.



    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }





Next, a test for a single digit.



    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





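This test does not compile: a token now has to represent either an operator or a number. Approaches based on a Token class hierarchy with dynamic_cast, on std::function, or on Boost.Any are all possible. Here, Token becomes a small class that stores its own type, and it is developed test by test like everything else. Its first test checks the type of an operator token.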



    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };





TokenType also needs a ToString function so that Assert can print it. The next test covers number tokens.



    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }





The test fails. Apply the (constant → scalar) transformation.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };





Next, the operator itself should be retrievable from an operator token.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





The implementation:

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };





And the same for a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





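Because a token holds either an operator or a number, never both, the two fields can share storage in a union. The resulting class is shown here.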



Token:

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        // Each conversion is only valid for a token of the matching type.
        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                return L"Unknown token.";
        }
    }
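The type check in the conversion operators is what keeps the union safe to use: reading the inactive member is never attempted. The sketch below repeats a condensed copy of the types so that it compiles on its own, and adds a helper, ReadingAsNumberThrows, which is a hypothetical name introduced only for this example.

```cpp
#include <stdexcept>

// Condensed copy of the article's types, repeated so the sketch is
// self-contained.
enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper for this example: reports whether reading the token
// as a number is rejected with std::logic_error.
inline bool ReadingAsNumberThrows(const Token &token) {
    try {
        const double value = static_cast<double>(token);
        (void)value;  // only the success or failure of the conversion matters
        return false;
    } catch (const std::logic_error &) {
        return true;
    }
}
```

With this in place, an operator token refuses to be read as a number, while a number token converts normally.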







With Token in place, back to the lexer and the single-digit test. To make it pass, add a check for a digit:



    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };





The test passes. Next, a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





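The C standard library already has what we need: isdigit to classify a character and atof to parse a number, along with their wchar_t counterparts. Using them is the (expression → function) transformation.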



    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }





Next, an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





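The test fails, because only the first token is returned. Before changing the behavior, a small refactoring: collect the tokens in a result variable instead of returning them directly.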



    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }





Now the transformation itself: (if → while).

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





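wcstod replaces _wtof here because, in addition to the parsed value, it reports where the number ends, so scanning can continue from that position.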
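The end-pointer behavior of wcstod can be tried in isolation. The following is a minimal sketch; the ScanResult type and the ScanNumber function are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper type for this example.
struct ScanResult {
    double value;          // the parsed number (0.0 if nothing was parsed)
    std::size_t consumed;  // how many characters wcstod read
};

// Parses a number at the start of text and reports how far wcstod got.
// When no number starts at text, wcstod leaves end equal to text,
// so consumed is 0.
inline ScanResult ScanNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    return { value, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.5+3" this yields the value 12.5 with four characters consumed, which is exactly the information the lexer needs to move on to the plus sign. Note that wcstod also accepts leading whitespace and a sign, which is why the lexer only calls it after iswdigit has confirmed that a digit comes first.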



Next, spaces must be skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





Apply (unconditional → if): a character that is neither a digit nor an operator is simply skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }





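All tests pass, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.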



    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }





Detail::Tokenizer:

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







So far only the plus operator is recognized. The next test covers the remaining operators and parentheses.



    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }





The Operator enumeration gets the new values.



    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





The IsOperator method of Tokenizer is updated to match.



    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
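The same membership check can be tried outside the class. Below is a minimal sketch as a free function; the name IsOperatorChar is an assumption made for this example, and the Operator enumeration is repeated so that the block compiles on its own.

```cpp
#include <algorithm>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Free-function variant of the check, written for this example only.
// The braced list deduces to std::initializer_list<Operator>, and
// std::any_of stops at the first operator whose code matches symbol.
inline bool IsOperatorChar(wchar_t symbol) {
    const auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                       Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [symbol](Operator o) {
        return symbol == static_cast<wchar_t>(o);
    });
}
```

Listing the enumerators explicitly means that a character is only cast to Operator after it is known to be a valid value, at the cost of having to extend the list whenever the enumeration grows.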





All tests pass. Here is the complete code.



Interpreter.h:

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        // Tokens are equal when both the type and the stored value match.
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                         Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







InterpreterTests.cpp:

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







The code for this article is available on GitHub.








. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





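This test does not compile. A token can now be either an operator or a number, so it has to carry values of different types. Several designs come to mind:

  1. Create a base class Token with a subclass for each kind of token, and recover the concrete type with dynamic_cast.
  2. Hide the difference between the kinds behind callbacks, for example with std::function.
  3. Use a ready-made type such as Boost.Any.

We will go with something simpler: a Token class that knows its own type. Token gets its own set of tests, starting with the type of an operator token.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};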





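Assert also needs a ToString function for TokenType; it is trivial, so it is not shown here. The test passes. Next comes the same test for a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}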





The test fails. We apply the (constant → scalar) transformation and store the type in a member variable.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};





Both tests pass. Next, a token should give back the value it was created with, starting with the operator.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





To make it pass, we store the operator and add a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





Now the same test for a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both at once, so the two values can share storage in a union. The conversion operators check the token type and throw if the wrong kind of value is requested.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type holds a valid value.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
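The article targets Visual Studio 2013, where a hand-written tagged union is the natural choice. With a C++17 compiler, roughly the same behavior can be sketched with std::variant, which is a different technique from the one used in the article. The helper names `IsNumber`, `IsOperator`, `AsNumber` and `AsOperator` below are hypothetical.

```cpp
#include <cassert>
#include <variant>

enum class Operator : wchar_t { Plus = L'+' };

// C++17 alternative to the hand-written tagged union: std::variant tracks
// which alternative is currently stored.
using Token = std::variant<Operator, double>;

inline bool IsNumber(const Token &token) { return std::holds_alternative<double>(token); }
inline bool IsOperator(const Token &token) { return std::holds_alternative<Operator>(token); }

// std::get throws std::bad_variant_access when the other alternative is
// active, which plays the role of the std::logic_error checks in Token.
inline double AsNumber(const Token &token) { return std::get<double>(token); }
inline Operator AsOperator(const Token &token) { return std::get<Operator>(token); }
```

The trade-off is that the implicit conversions of the article's Token (for example, passing a Token where a double is expected) are replaced by explicit accessor calls.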







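With Token in place, we can return to the lexer. The single-digit test now compiles and fails. The simplest code that makes it pass:

if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };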





The test passes, and there is nothing to refactor. Next, a number with several digits and a decimal point.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





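To recognize such a number, we can rely on the C standard library: isdigit to detect a digit and atof to convert the text into a number, or rather their wide-character counterparts, because we work with wchar_t. This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    // _wtof is the Microsoft CRT wide-character counterpart of atof.
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}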





The test passes. Now an expression with more than one token.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





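The test fails, because Tokenize returns right after the first token. Before fixing that, a small refactoring prepares the ground: the tokens are collected in a local variable result, which is returned at the end.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}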





Now the transformation itself: (if → while). The condition becomes a loop, and the pointer moves past each token.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





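wcstod is used here instead of _wtof because, besides converting the number, it reports through its second argument where the number ended, so the loop knows where to continue. All tests pass.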
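The role of the end pointer is easy to see in isolation. The following sketch wraps the call in a free function; the name `ScanNumber` and its signature are assumptions made for this example.

```cpp
#include <cassert>
#include <cwchar>

// Parses a number at the start of `text`, stores it in `value`, and returns a
// pointer to the first character after it. If no number can be parsed,
// wcstod sets the end pointer to `text`, so the returned pointer equals `text`.
inline const wchar_t *ScanNumber(const wchar_t *text, double &value) {
    wchar_t *end = nullptr;
    value = std::wcstod(text, &end);
    return end;
}
```

Keep in mind that wcstod also accepts a leading sign and leading whitespace, so a lexer that treats '+' and '-' as operators should call it only when the current character is a digit, as the loop above does.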



Next on the list: spaces must be ignored.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





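The (unconditional → if) transformation helps again: a character becomes an operator token only if it really is an operator, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        // Neither a digit nor an operator: skip the character.
        ++current;
    }
}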





The tests pass, so it is time to refactor. The function has grown, so the scanning logic moves into a Tokenizer class inside the Detail namespace. The Tokenize function now only creates a tokenizer and returns its result.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







The code is now easier to read and to extend. What remains is to support the other operators, so the last lexer test is a complex expression.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





First we add the missing operators to Operator, so that the test compiles.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





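The test fails. To fix it, we extend the IsOperator method of Tokenizer so that it checks the current character against all known operators.

// Requires <algorithm> for std::any_of.
bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}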
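The same membership check can be sketched without a lambda by keeping all operator characters in one string. This is a hypothetical alternative, not the article's code; `IsOperatorChar` is a name chosen for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical alternative to the std::any_of version: keep every operator
// character in one string and search it.
inline bool IsOperatorChar(wchar_t ch) {
    static const std::wstring operators = L"+-*/()";
    // find() never matches the terminating L'\0', because the terminator is
    // not part of the string's contents.
    return operators.find(ch) != std::wstring::npos;
}
```

The downside is that the list of characters has to be kept in sync with the Operator enumeration by hand, whereas the std::any_of version refers to the enumerators directly.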





All tests pass, and the lexer is complete. Here is the full code.

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    // Only the member that matches m_type holds a valid value.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests







The code for this article is available on GitHub.








std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







With this structure in place, the remaining operators can be added. The next test covers a complex expression.



    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }





The Operator enumeration gets the missing values.



    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





The IsOperator method of Tokenizer then has to recognize all of them.



    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
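The same membership check can also be written as a lookup in a string of operator characters. This is a sketch of an alternative, not the article's implementation: IsOperatorChar is a hypothetical name, and the character list duplicates the values of Operator, so the two would have to be kept in sync by hand.

```cpp
#include <cwchar>

// Hypothetical alternative to the any_of version (an assumption for
// illustration). Returns true if `c` is one of the six operator characters.
inline bool IsOperatorChar(wchar_t c)
{
    // wcschr also matches the terminating L'\0', so rule it out first.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The any_of version keeps the enumeration as the single list of operators, which is arguably the safer choice when new operators are added later.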





The complete code of the lexer and its tests at this point follows.



Interpreter.h

    #pragma once
    #include <algorithm>
    #include <initializer_list>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        // Only the member selected by m_type holds a valid value.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                         Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    // Compares an expected initializer list with an actual range, element by element.
    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
                Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
                Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    } // namespace InterpreterTests







The code is available on GitHub.











    GitHub . , . "__".



    . , .




  3. std::vector, Tokens



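    The test does not compile yet. We add just enough code for it to build and fail.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: compiles, but fails any test that calls it.
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter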





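    The test fails, as it should. Now it has to be made to pass with the smallest possible change to the code. To choose that change we can rely on The Transformation Priority Premise (TPP), proposed by Robert C. Martin. Its idea is that, while refactorings change the structure of code without changing its behavior, transformations change the behavior, moving the code from the specific to the more general. Transformations are ordered by priority, and at each step it is preferable to use the simplest one that makes the test pass.

    The transformations, in priority order:

    1. ({} → nil) from no code at all to code that uses nil.
    2. (nil → constant) replacing nil with a constant.
    3. (constant → constant+) replacing a simple constant with a more complex one.
    4. (constant → scalar) replacing a constant with a variable or an argument.
    5. (statement → statements) adding more unconditional statements (break, continue, return, and the like).
    6. (unconditional → if) splitting the execution path.
    7. (scalar → array) replacing a variable or argument with an array.
    8. (array → container) replacing an array with a more complex container.
    9. (statement → recursion) replacing a statement with recursion.
    10. (if → while) replacing a condition with a loop.
    11. (expression → function) replacing an expression with a function or algorithm.
    12. (variable → assignment) changing the value of a variable.

    We apply the first of them and return an empty list.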



    inline Tokens Tokenize(std::string expr) { return{}; }
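    To make the list of transformations less abstract, here is a minimal sketch of how a few of them might be applied one test at a time. CountDigits and its three versions are hypothetical names invented for this illustration; they are not part of the interpreter.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Version 1, (nil → constant): the only test passes an empty string,
// so returning a constant is enough.
inline int CountDigitsV1(const std::wstring &) { return 0; }

// Version 2, (unconditional → if): a test with L"7" forces a branch.
inline int CountDigitsV2(const std::wstring &text) {
    if(text.empty()) return 0;
    return 1;
}

// Version 3, (if → while): a test with L"1a23" forces iteration
// over the whole string.
inline int CountDigitsV3(const std::wstring &text) {
    int count = 0;
    for(wchar_t ch : text) {
        if(std::iswdigit(ch)) ++count;
    }
    return count;
}

// Each version satisfies the tests that existed when it was written.
inline void CountDigitsExample() {
    assert(CountDigitsV1(L"") == 0);
    assert(CountDigitsV2(L"") == 0 && CountDigitsV2(L"7") == 1);
    assert(CountDigitsV3(L"1a23") == 3);
}
```

    Each version is only as general as the tests written so far require, which is the point of preferring the simpler transformations first.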





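    The test passes, and there is nothing to refactor yet.

    Before the next test, we replace std::string with std::wstring so that expressions can contain Unicode text. Then we write a test for a single operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }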





    AssertRange is a small helper namespace. Its AreEqual function compares an expected list with the actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // Compare the sizes first so that a missing or extra element is reported clearly.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
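    The same comparison can be sketched without the test framework, which may help when reading AssertRange::AreEqual. FirstMismatch is a hypothetical helper written for this illustration; it returns an index instead of failing an assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first mismatch between
// `expect` and `actual`, or -1 when both ranges are equal.
// A size difference is reported at the index where the shorter range ends.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    const bool sameSize = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameSize ? -1 : position;
}

inline void FirstMismatchExample() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1); // equal ranges
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);  // element differs
    assert(FirstMismatch({ 1, 2 }, actual) == 2);     // sizes differ
}
```

    Taking the expected values as an initializer_list is what allows the braces at the call site, just as in the tests above.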







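    Operator is an enumeration. Its underlying type is wchar_t, so the value of an operator is simply its character, and for now Token is just an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;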





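    For Assert to print a readable message when a comparison fails, it needs a ToString function for the compared type.

    std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        // A token is an operator character, so its text is that single character.
        return{ static_cast<wchar_t>(token) };
    }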







    The test fails, because Tokenize always returns an empty list. To make it pass, we apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }





    The next test is for a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





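    This test does not compile, because a token now has to hold either an operator or a number. There are several ways to model that: a Token class hierarchy whose concrete type is recovered with dynamic_cast, an approach built on std::function, or a ready-made holder such as Boost.Any. We take a simpler route and let Token store its own type.

    The token needs tests of its own, starting with its type.

    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };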





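    ToString for TokenType is needed as well; it is straightforward and appears in the final listing. The next test covers a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }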





    The test fails. We apply (constant → scalar): the type is no longer a constant but a field set by the constructor.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };





    Next, a test for getting the operator back out of an operator token.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





    The token stores the operator and exposes it through a conversion operator.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };





    The same for a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





    Since a token holds either an operator or a number but never both, the two fields can share storage in a union. The conversion operators check the type before reading the value.

    Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        // Only the member that matches m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token."; // wide literal, since the function returns std::wstring
        }
    }
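    For comparison, with a C++17 compiler the same "operator or number" idea could be sketched with std::variant, which tracks the active member by itself. This is an assumption about a newer toolchain, not what the article uses: Visual Studio 2013 predates std::variant, and VariantToken and IsNumber are hypothetical names.

```cpp
#include <cassert>
#include <variant>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// Alternative sketch: the variant remembers which alternative it holds,
// so no separate type field or manual checks are needed.
using VariantToken = std::variant<Operator, double>;

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}

inline void VariantTokenExample() {
    const VariantToken number = 12.34;
    const VariantToken plus = Operator::Plus;
    assert(IsNumber(number));
    assert(!IsNumber(plus));
    assert(std::get<Operator>(plus) == Operator::Plus);
    // std::get<double>(plus) would throw std::bad_variant_access,
    // which plays the role of the std::logic_error thrown by Token.
}
```

    The hand-written class keeps the implicit conversions that the tests rely on, so the variant is shown only to illustrate the design space.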







    Now that Token can hold a number, we return to the lexer test for a single digit and make it pass:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };





    The test passes. Next, floating-point numbers.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





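    The C standard library already has what we need: isdigit to recognize a digit and atof to convert a string to a number, both with counterparts for wchar_t. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) }; // _wtof is specific to the Microsoft C runtime
        return{ static_cast<Operator>(*current) };
    }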





    Next, an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





    Before making the test pass, we refactor: the tokens are now collected in a result variable instead of being returned directly.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }





    Now the transformation: (if → while). The lexer keeps scanning until the end of the string.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // continue right after the parsed number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
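    The loop depends on wcstod reporting where the number ends. The sketch below isolates that behavior; ParseNumber is a hypothetical wrapper written for this illustration, and it assumes the default "C" locale, in which the decimal separator is a dot.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Parses the number at the start of `text` and reports how many
// characters were consumed, using the end pointer set by wcstod.
inline double ParseNumber(const wchar_t *text, std::size_t &consumed) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    consumed = static_cast<std::size_t>(end - text);
    return value;
}

inline void ParseNumberExample() {
    std::size_t consumed = 0;
    const double value = ParseNumber(L"12.34+5", consumed);
    assert(value == 12.34);
    assert(consumed == 5); // parsing stops at the '+' character
}
```

    If no number can be parsed, wcstod leaves the end pointer at the start of the input, which is why the lexer calls it only after iswdigit has confirmed a digit.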





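    wcstod does the same job as _wtof, but it also reports, through its second argument, the position just past the parsed number. That lets the lexer continue scanning from there.

    Next, the lexer should skip spaces.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }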





    We apply (unconditional → if): a character is added as an operator only if it is one, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }





    The function has become hard to read, so it is time to refactor. The scanning logic moves into a class in the Detail namespace, and Tokenize only delegates to it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext(); // skip spaces and any other unrecognized character
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail
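    Note that the final else branch skips every unrecognized character, not only spaces. A stricter variant might separate whitespace from invalid input. The sketch below shows one way this could look; it is a design alternative, not what the article does, and CharKind and Classify are hypothetical names.

```cpp
#include <cassert>
#include <cwctype>
#include <stdexcept>
#include <string>

enum class CharKind { Digit, Operator, Space };

// Hypothetical stricter classification: anything that is not a digit,
// a known operator, or whitespace is reported as an error.
inline CharKind Classify(wchar_t ch) {
    static const std::wstring operators = L"+-*/()";
    if(std::iswdigit(ch)) return CharKind::Digit;
    if(operators.find(ch) != std::wstring::npos) return CharKind::Operator;
    if(std::iswspace(ch)) return CharKind::Space;
    throw std::invalid_argument("Unexpected character in expression.");
}

inline void ClassifyExample() {
    assert(Classify(L'7') == CharKind::Digit);
    assert(Classify(L'*') == CharKind::Operator);
    assert(Classify(L' ') == CharKind::Space);

    bool rejected = false;
    try { Classify(L'x'); }
    catch(const std::invalid_argument &) { rejected = true; }
    assert(rejected);
}
```

    Whether unknown characters should be rejected in the lexer or left for a later stage is a design decision; in TDD it would be driven by a new test rather than added up front.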







    Finally, a test with a complex expression that uses all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }





    Operator is extended with the remaining operators and the parentheses.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





    The test still fails: IsOperator in Tokenizer has to recognize the new operators as well.

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        // True if the current character matches any known operator.
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
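    The same check could also be written as a switch over the enumeration. The sketch below is only an alternative form for comparison; IsOperatorChar is a hypothetical free function, not the member function used in the article.

```cpp
#include <cassert>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Because Operator has a fixed underlying type (wchar_t), any character
// can be cast to it safely; unknown values fall through to `default`.
inline bool IsOperatorChar(wchar_t ch) {
    switch(static_cast<Operator>(ch)) {
        case Operator::Plus:
        case Operator::Minus:
        case Operator::Mul:
        case Operator::Div:
        case Operator::LParen:
        case Operator::RParen:
            return true;
        default:
            return false;
    }
}

inline void IsOperatorCharExample() {
    assert(IsOperatorChar(L'+'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
}
```

    Both forms have to be updated by hand when a new operator is added, so the choice between them is mostly a matter of taste.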





    All tests pass. The complete code at this point:

    Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        // Only the member that matches m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext(); // skip spaces and any other unrecognized character
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = {
                Operator::Plus, Operator::Minus, Operator::Mul,
                Operator::Div, Operator::LParen, Operator::RParen
            };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







    InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







    The code for this article is available on GitHub.








  5. std::vector, Tokens



    . .



    #pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





    , , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



    :



    ({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

    .



    inline Tokens Tokenize(std::string expr) { return{}; }





    , . .



    . . . . . .

    , std::string



    std::wstring



    . , Unicode. .



    TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





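    AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of any mismatch.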



    namespace AssertRange
    {
        template<class T, class ActualRange>
        static void AreEqual(initializer_list<T> expect, const ActualRange &actual)
        {
            auto actualIter = begin(actual);
            auto expectIter = begin(expect);

            Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

            for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter)
            {
                auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
                Assert::AreEqual(*expectIter, *actualIter, message.c_str());
            }
        }
    } // namespace AssertRange
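    The same element-by-element comparison can be sketched without the test framework. FirstMismatch below is a hypothetical helper, not part of CppUnitTestFramework or of the article's code; it returns the index of the first difference instead of failing an assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper (an assumption for illustration): returns the index of
// the first position where the two ranges differ, or -1 if they are equal.
// When one range is a prefix of the other, the length of the shorter range
// is returned.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual)
{
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;

    // Walk both ranges in lockstep until one of them ends.
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position)
    {
        if(!(*expectIter == *actualIter)) return position;
    }

    const bool bothExhausted =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return bothExhausted ? -1 : position;
}
```

    Returning a position rather than asserting keeps the helper usable from any test framework; the caller decides how to report the failure.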







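    Operator is an enumeration. Its underlying type is wchar_t, so the value of each operator is simply the character that denotes it.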



    enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





    For the framework's Assert to print values in a failure message, a ToString function must exist for the compared type.



    inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







    Now the test compiles and fails. To make it pass, we split the execution path with (unconditional → if).



    inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





    The next test covers a single digit.



    TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





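    The test does not compile: a token must now hold either an operator or a number. There are several ways to model this, among them a class hierarchy rooted at Token that is queried with dynamic_cast, an approach built on std::function, and a type-erased holder such as Boost.Any.

    We take a simpler route: a single Token class that knows its own type. That class needs tests of its own, so we set the lexer test aside for a moment and start with the token type.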



    enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





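    A ToString overload for TokenType is needed as well, so that Assert can print it. Next comes the same test for a number token.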



    TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





    The test fails. We apply (constant → scalar) and replace the constant with a member variable.



    class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





    Next, we want to get the operator back out of an operator token.



    TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





    The token stores the operator and exposes it through a conversion operator.



    class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





    The same test for a number token:



    TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





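    A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the stored type and throw on a mismatch.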



    class Token
    {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        // Each conversion checks the stored type before reading the union.
        operator Operator() const
        {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const
        {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union
        {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token)
    {
        switch(token.Type())
        {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }
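    For comparison, here is a minimal sketch of the same interface on top of std::variant. It assumes a C++17 compiler, which the article's toolchain does not provide; VariantToken is a hypothetical name, and the article itself uses the tagged union shown above.

```cpp
#include <cassert>
#include <stdexcept>
#include <variant>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Sketch only (assumes C++17): std::variant tracks which alternative is
// active, so no separate type tag or raw union is needed.
class VariantToken
{
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    TokenType Type() const
    {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    // get_if returns nullptr when the other alternative is stored.
    operator Operator() const
    {
        if(auto op = std::get_if<Operator>(&m_value)) return *op;
        throw std::logic_error("Should be operator token.");
    }

    operator double() const
    {
        if(auto num = std::get_if<double>(&m_value)) return *num;
        throw std::logic_error("Should be number token.");
    }

private:
    std::variant<Operator, double> m_value;
};
```

    The behavior matches the union version: reading a token as the wrong kind throws std::logic_error.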







    Token is ready, so we return to the lexer and its single-digit test. The following change to Tokenize makes it pass:



    if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





    Next, a number with several digits and a fractional part.

    TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





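    Rather than parsing numbers by hand, we use the C standard library: isdigit to recognize a digit and atof to convert the text, or rather their counterparts for wchar_t. This is the (expression → function) transformation.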



    inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





    Next, an operator followed by a number.



    TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





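    To return more than one token, we first prepare the code without changing its behavior: the tokens are collected in a local variable, result, which is returned at the end.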



    inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





    Now the transformation itself: (if → while). The loop scans the whole string instead of only its first token.



    inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





    wcstod does the same job as _wtof, but it also reports where the number ended, so scanning can continue from that position.
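    The role of the end pointer is easiest to see in isolation. The sketch below uses a hypothetical helper, ScanNumbers, that is not part of the article's lexer: it collects every number in a string and skips everything else.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper for illustration: returns all numbers found in the
// text, ignoring any character that does not start a number.
inline std::vector<double> ScanNumbers(const std::wstring &text)
{
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current)
    {
        if(std::iswdigit(*current))
        {
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(current, &end));
            current = end; // wcstod stores the first unparsed character here
        }
        else
        {
            ++current; // not a digit: skip one character
        }
    }
    return numbers;
}
```

    For the input L"12.34+5" the first call to wcstod stops at the plus sign, the loop skips it, and the second call parses 5.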



    Next, whitespace must be skipped.



    TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





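    We apply (unconditional → if) again: a character becomes an operator token only if it really is an operator, and anything else is skipped.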



    while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





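    The function has grown, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.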



    inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





    namespace Detail
    {
        class Tokenizer
        {
        public:
            Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

            void Tokenize()
            {
                while(!EndOfExperssion())
                {
                    if(IsNumber()) { ScanNumber(); }
                    else if(IsOperator()) { ScanOperator(); }
                    else { MoveNext(); }
                }
            }

            const Tokens &Result() const { return m_result; }

        private:
            bool EndOfExperssion() const { return *m_current == L'\0'; }

            bool IsNumber() const { return iswdigit(*m_current) != 0; }

            void ScanNumber()
            {
                wchar_t *end = nullptr;
                m_result.push_back(wcstod(m_current, &end));
                m_current = end;
            }

            bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

            void ScanOperator()
            {
                m_result.push_back(static_cast<Operator>(*m_current));
                MoveNext();
            }

            void MoveNext() { ++m_current; }

            const wchar_t *m_current;
            Tokens m_result;
        };
    } // namespace Detail







    Finally, a test with a complete expression that uses every operator as well as parentheses.



    TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





    The Operator enumeration gets the missing values.



    enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





    The test still fails. The IsOperator method of Tokenizer must recognize all of the operators.



    bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





    All tests pass. The complete code of both files follows.



    Interpreter.h

    #pragma once
    #include <algorithm>
    #include <initializer_list>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    // The value of each operator is the character that denotes it.
    enum class Operator : wchar_t
    {
        Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type)
    {
        switch(type)
        {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token
    {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const
        {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const
        {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right)
        {
            if(left.m_type == right.m_type)
            {
                switch(left.m_type)
                {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union
        {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token)
    {
        switch(token.Type())
        {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer
    {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize()
        {
            while(!EndOfExperssion())
            {
                if(IsNumber()) { ScanNumber(); }
                else if(IsOperator()) { ScanOperator(); }
                else { MoveNext(); } // anything else (such as a space) is skipped
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber()
        {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const
        {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator()
        {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr)
    {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







    InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests
    {
        using namespace Microsoft::VisualStudio::CppUnitTestFramework;
        using namespace Interpreter;
        using namespace std;

        namespace AssertRange
        {
            template<class T, class ActualRange>
            static void AreEqual(initializer_list<T> expect, const ActualRange &actual)
            {
                auto actualIter = begin(actual);
                auto expectIter = begin(expect);

                Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

                for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter)
                {
                    auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
                    Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
                }
            }
        } // namespace AssertRange

        TEST_CLASS(LexerTests)
        {
        public:
            TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression)
            {
                Tokens tokens = Lexer::Tokenize(L"");
                Assert::IsTrue(tokens.empty());
            }

            TEST_METHOD(Should_tokenize_single_plus_operator)
            {
                Tokens tokens = Lexer::Tokenize(L"+");
                AssertRange::AreEqual({ Operator::Plus }, tokens);
            }

            TEST_METHOD(Should_tokenize_single_digit)
            {
                Tokens tokens = Lexer::Tokenize(L"1");
                AssertRange::AreEqual({ 1.0 }, tokens);
            }

            TEST_METHOD(Should_tokenize_floating_point_number)
            {
                Tokens tokens = Lexer::Tokenize(L"12.34");
                AssertRange::AreEqual({ 12.34 }, tokens);
            }

            TEST_METHOD(Should_tokenize_plus_and_number)
            {
                Tokens tokens = Lexer::Tokenize(L"+12.34");
                AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
            }

            TEST_METHOD(Should_skip_spaces)
            {
                Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
                AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
            }

            TEST_METHOD(Should_tokenize_complex_experssion)
            {
                Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
                AssertRange::AreEqual({
                    Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                    Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
                }, tokens);
            }
        };

        TEST_CLASS(TokenTests)
        {
        public:
            TEST_METHOD(Should_get_type_for_operator_token)
            {
                Token opToken(Operator::Plus);
                Assert::AreEqual(TokenType::Operator, opToken.Type());
            }

            TEST_METHOD(Should_get_type_for_number_token)
            {
                Token numToken(1.2);
                Assert::AreEqual(TokenType::Number, numToken.Type());
            }

            TEST_METHOD(Should_get_operator_code_from_operator_token)
            {
                Token token(Operator::Plus);
                Assert::AreEqual<Operator>(Operator::Plus, token);
            }

            TEST_METHOD(Should_get_number_value_from_number_token)
            {
                Token token(1.23);
                Assert::AreEqual<double>(1.23, token);
            }
        };
    }







    The code for this article is available on GitHub.











    , . , , . .



    TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





    Operator



    , .



    enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





    . IsOperator



    Tokenizer



    .



    bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





    . .



    Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) 
{ ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







    InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), 
Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







    GitHub . , . "__".



    . , .




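    To make the first test compile, declare a Tokenize function that returns a std::vector of tokens, hidden behind the Tokens alias for convenience. For now the function only throws, so the test fails (red).

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: throws so that the first test fails.
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter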





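    The next step is to make the test pass with the smallest possible change. A useful guide for choosing that change is The Transformation Priority Premise (TPP). A refactoring changes the structure of code without changing its behavior; a transformation changes the behavior, moving the code from the specific toward the generic. The premise is that transformations have a preferred order, and that choosing the simplest applicable one at each step tends to produce simpler implementations.

    The transformations, from simplest to most complex:

    1. ({} → nil): from no code at all to code that employs nil.
    2. (nil → constant): replacing nil with a constant.
    3. (constant → constant+): from a simple constant to a more complex one.
    4. (constant → scalar): replacing a constant with a variable or an argument.
    5. (statement → statements): adding more unconditional statements (break, continue, return and so on).
    6. (unconditional → if): splitting the execution path.
    7. (scalar → array): from a single value to an array.
    8. (array → container): from an array to a more complex container.
    9. (statement → recursion): introducing recursion.
    10. (if → while): turning a condition into a loop.
    11. (expression → function): replacing an expression with a function or algorithm.
    12. (variable → assignment): replacing the value of a variable.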
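To make the ordering more tangible, here is a minimal sketch of one small function growing through a few of these transformations. The `SumV1`, `SumV2` and `SumV3` names are hypothetical and are not part of the interpreter; each version stands for the code after one more test has been made to pass.

```cpp
#include <cassert>
#include <vector>

// After the first test (empty input): (nil -> constant).
inline int SumV1(const std::vector<int> &) { return 0; }

// After a single-element test: (unconditional -> if) plus (constant -> scalar).
inline int SumV2(const std::vector<int> &values) {
    if(values.empty()) return 0;
    return values[0];
}

// After a multi-element test: the condition becomes a loop, (if -> while),
// and the result is accumulated, (variable -> assignment).
inline int SumV3(const std::vector<int> &values) {
    int result = 0;
    for(int value : values) result += value;
    return result;
}
```

Each version is only as general as the tests so far require, which is the discipline the lexer below follows as well.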

    The simplest change that makes the test pass is to return an empty list.

    inline Tokens Tokenize(std::string expr) { return{}; }





    Before moving on, std::string is replaced with std::wstring, since the interpreter should be able to work with Unicode. The next test covers a single operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }





    AssertRange is a small helper namespace defined in the test project. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

    AssertRange:

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // Compare the sizes first, then the elements pairwise.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange







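    Operator is an enum class whose underlying type is wchar_t, so the value of each operator is simply its character, and a character can be cast directly to an operator. For now, Token is just an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;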
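As a small illustration of why the underlying type matters, the sketch below converts between a character and an operator with plain casts. `ToChar` and `FromChar` are hypothetical helper names used only here.

```cpp
#include <cassert>

enum class Operator : wchar_t {
    Plus = L'+',
};

// The enumerator's value is the character itself, so both directions are a cast.
inline wchar_t ToChar(Operator op) { return static_cast<wchar_t>(op); }
inline Operator FromChar(wchar_t c) { return static_cast<Operator>(c); }
```

One thing to keep in mind: because the underlying type is fixed, casting an arbitrary character is well defined but may yield a value that matches no enumerator, so the caller has to check that the character really is an operator before casting.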





    For Assert to print the compared values in a failure message, a ToString function has to be provided for the token type.

    std::wstring ToString(const Token &):

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }







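    To pass the new test while keeping the previous one green, apply (unconditional → if): an empty expression still gives an empty list, and anything else gives an operator built from the first character.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }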





    The next test is for a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





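    To pass this test, a token must be able to hold either an operator or a number. There are several ways to represent that, among them:

    1. A class hierarchy with a base Token class, using dynamic_cast to recover the concrete type.
    2. An approach built on std::function.
    3. A type-erased container such as Boost.Any.

    The approach taken here is a single Token class that knows its own type. The first test for it checks the type of an operator token.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };

    …

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };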





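    A ToString function for TokenType is needed as well, so that Assert can print it. The next test checks the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }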





    Apply (constant → scalar): the constant returned by Type() becomes a member variable that each constructor sets.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };





    Next, a token must give back the operator it was created from.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    A conversion operator is enough for that.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };

    The same is needed for the value of a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





    A token holds either an operator or a number, never both, so the two fields can share storage in an anonymous union. The conversion operators check the token type and throw if it does not match.

    Token:

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: a narrow string does not convert to std::wstring.
                return L"Unknown token.";
        }
    }
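The following self-contained sketch repeats the Token class in a reduced form and shows how the checked conversions behave. `ThrowsWhenReadAsNumber` is a hypothetical helper added only for this illustration; it is not part of the article's code.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };  // only one member is active
};

// Hypothetical helper: reports whether reading the token as a number throws.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        double value = token;  // uses operator double()
        (void)value;
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

Reading a number token as a number succeeds, while reading an operator token as a number throws instead of silently reinterpreting the union's bytes, which is the point of storing the type tag next to the union.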







    With Token in place, the lexer test for a single digit can be made to pass with a naive implementation:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };





    The next test is a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





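    Instead of parsing digits by hand, use the C standard library: isdigit to detect a digit and atof to convert the text, or rather their wide-character counterparts, since the lexer works with wchar_t. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        // _wtof is specific to the Microsoft C runtime.
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }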





    The next test combines an operator and a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





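    The function currently returns right after the first token. As a preparatory refactoring, collect the tokens in a local result variable, without changing the behavior yet.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }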





    Now the transformation itself: (if → while). The loop runs until the end of the string, and each branch advances the pointer past the token it has just read.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





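    wcstod replaces _wtof because, besides the parsed value, it also reports a pointer to the first character after the number, which is exactly what is needed to continue scanning from the right place.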
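The behavior of the end pointer can be seen in isolation with a short sketch. `ParseLeadingNumber` is a hypothetical helper written for this illustration; only `std::wcstod` itself comes from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: parses the number at the start of `text`, stores it in
// `value`, and returns how many characters wcstod consumed.
inline std::size_t ParseLeadingNumber(const wchar_t *text, double &value) {
    wchar_t *end = nullptr;
    value = std::wcstod(text, &end);  // `end` points just past the parsed number
    return static_cast<std::size_t>(end - text);
}
```

For the input `L"12.34+5"` the helper consumes five characters and leaves the scan position at the plus sign. When the text does not start with a number, wcstod consumes nothing and returns zero, so a scanner built on it should only call it after checking that a number really starts at the current position, as the lexer does with iswdigit.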



    The next test: spaces must be skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





    Apply (unconditional → if) once more: only a plus sign is treated as an operator, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }





    The function has grown enough to deserve a refactoring. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    Detail::Tokenizer:

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







    The last test for the lexer is a complex expression with all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }





    The Operator enumeration is extended with the missing operators and the parentheses.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





    The IsOperator method of the Tokenizer class now has to recognize all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
    }





    The complete code at this point is listed below.

    Interpreter.h:

    #pragma once
    #include <algorithm>
    #include <cwctype>
    #include <initializer_list>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        // Tokens are equal when both the type and the stored value match.
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







    InterpreterTests.cpp:

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







    The code for this article is available on GitHub.




Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







    GitHub . , . "__".



    . , .




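    The lexer returns its result as a std::vector of tokens, aliased as Tokens. The first version is only a stub: it compiles, and it throws so that the first test fails.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: throws so that the first test fails (red).
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter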





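    To get from a failing test to a passing one, the code is changed in small steps. The Transformation Priority Premise (TPP) orders such steps, called transformations, from the simplest to the most complex, and suggests preferring the simplest transformation that makes the current test pass.

    The transformations, from highest priority to lowest:

  1. ({} → nil): from no code at all to code that returns nil.
  2. (nil → constant): return a constant instead of nil.
  3. (constant → constant+): replace a simple constant with a more complex one.
  4. (constant → scalar): replace a constant with a variable or an argument.
  5. (statement → statements): add more unconditional statements (break, continue, return and so on).
  6. (unconditional → if): split the execution path.
  7. (scalar → array): replace a variable with an array.
  8. (array → container): replace an array with a container.
  9. (statement → recursion): replace a statement with recursion.
  10. (if → while): replace a condition with a loop.
  11. (expression → function): replace an expression with a function or algorithm.
  12. (variable → assignment): replace the value of a variable.

    The first test needs only the simplest of them: return an empty list.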



    inline Tokens Tokenize(std::string expr) { return{}; }





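    The test passes. The next test feeds the lexer a single operator. Along the way std::string is replaced with std::wstring, because the test framework works with Unicode strings.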



    TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





    AssertRange is a small helper namespace. Its AreEqual compares an expected initializer list with an actual range: it checks the sizes first, then compares element by element and names the position of the first mismatch.

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        // Compare sizes first so that a missing or extra token is reported clearly.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
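    The helper above is tied to the Assert class of CppUnitTestFramework. The same idea can be tried without the framework. The following is only a sketch: FirstMismatch is a name invented for this illustration, and it returns a position instead of raising an assertion failure.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>   // used by callers such as the usage example below

// Hypothetical helper, not part of the article's code.
// Returns -1 when the ranges are equal, otherwise the zero-based position
// of the first difference. When one range is a strict prefix of the other,
// the length of the shorter range is returned.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual)
{
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position)
    {
        if(!(*expectIter == *actualIter))
            return position;
    }
    // One of the ranges ended early, so the sizes differ.
    if(expectIter != std::end(expect) || actualIter != std::end(actual))
        return position;
    return -1;
}
```

    With std::vector<int> v{1, 2, 3}, the call FirstMismatch({1, 5, 3}, v) reports position 1. As in the article's helper, the element type T is deduced from the braced list.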







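    Operator is declared as an enum class with wchar_t as its underlying type, so each enumerator can hold the character of its operator directly. For now Token is simply an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;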





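    To print the compared values in a failure message, Assert needs a ToString function for their type, so one is added for Token.

    inline std::wstring ToString(const Token &token) {
        // A one-character string holding the operator's symbol.
        return{ static_cast<wchar_t>(token) };
    }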







    The empty-expression test has to keep passing, so the execution path is split: the (unconditional → if) transformation.



    inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





    . . . …

    .



    TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





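    This test does not compile yet: Token is still an alias for Operator and cannot hold a number. Several designs for a richer Token are possible, among them a class hierarchy inspected with dynamic_cast, std::function, or Boost.Any. The article settles on a single Token class that stores its own type, and develops it test-first in a separate TokenTests class.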



    enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





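    Assert::AreEqual needs a ToString for TokenType as well, so that a failed comparison can be printed. The next test covers the number token.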



    TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





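    The constant return value is replaced with a member variable set by each constructor: the (constant → scalar) transformation.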



    class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





    . . . .

    .



    TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





    .



    class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





    .



    TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





    A token holds either an operator or a number, never both, so the two values can share storage in a union. Each conversion operator checks the stored type first.

    Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}
        TokenType Type() const { return m_type; }
        // The conversions throw when the token holds the other kind of value.
        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }
    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            // Wide literal: the function returns std::wstring.
            default: return L"Unknown token.";
        }
    }
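    To see the type check in the conversion operators at work, here is a trimmed, framework-free copy of the class. It is a sketch for experimentation only: Operator is cut down to a single enumerator, and ThrowsWhenReadAsNumber is a helper invented for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token
{
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }

    operator Operator() const
    {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const
    {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only the member selected by m_type is ever read.
    union
    {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a number is rejected.
inline bool ThrowsWhenReadAsNumber(const Token &token)
{
    try
    {
        double value = token;   // uses operator double()
        (void)value;
        return false;
    }
    catch(const std::logic_error &)
    {
        return true;
    }
}
```

    An operator token is rejected when read as a number, while Token(1.5) converts back to 1.5. Because Operator is a scoped enum, it never converts to double implicitly, so the two conversion operators do not compete with each other.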







    , , , . . .



    . . . .

    :



    if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





    , .



    . .

    TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





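    The C standard library already has what is needed here: isdigit to recognize a digit and atof to convert a string to a number, or rather their counterparts for wchar_t. Replacing the hand-written expression with a library call is the (expression → function) transformation.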



    inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





    . .



    TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





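    The function can still return only one token. Before a loop can be added, the early returns are replaced with a result variable that collects the tokens and is returned at the end.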



    inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





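    Now the next transformation is straightforward: (if → while). The loop runs until the end of the expression is reached.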



    inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





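    wcstod replaces _wtof here because, besides converting the number, it reports through its second argument where the number ended. The current pointer can then jump straight past the number instead of stepping through it character by character.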
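    The behavior of the end pointer is easy to check in isolation. The following is a minimal sketch; ParsedNumber and ParseLeadingNumber are names invented for this illustration and do not appear in the article's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical result type: the parsed value and the number of
// characters that wcstod consumed.
struct ParsedNumber
{
    double value;
    std::size_t length;
};

// Parses the number at the start of `text`.
// When no number can be parsed, wcstod leaves `end` equal to the start
// of the string, so `length` is 0.
inline ParsedNumber ParseLeadingNumber(const std::wstring &text)
{
    const wchar_t *begin = text.c_str();
    wchar_t *end = nullptr;
    const double value = std::wcstod(begin, &end);
    return { value, static_cast<std::size_t>(end - begin) };
}
```

    For the input L"12.5+3" the helper yields the value 12.5 and a length of 4, which is exactly how far the lexer has to advance before it looks at the plus sign. The sketch assumes the default "C" locale, where the decimal separator is a dot.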



    . .

    .



    TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





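    One more (unconditional → if) transformation: only a known operator character becomes a token, and any other character, such as a space, is skipped.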



    while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





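    Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.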



    inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





    Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







    , . , , . .



    TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





    Operator is extended with the remaining arithmetic operators and the parentheses.



    enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





    IsOperator in Tokenizer now has to recognize all of them.



    bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
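    The same membership test can be tried outside the class. Below is a minimal sketch, assuming a free function named IsOperatorChar (a name invented for this example) that takes the character as a parameter instead of reading it from m_current.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

enum class Operator : wchar_t
{
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function variant of Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t symbol)
{
    // `all` is deduced as std::initializer_list<Operator>.
    const auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                       Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [symbol](Operator o) { return symbol == static_cast<wchar_t>(o); });
}
```

    IsOperatorChar(L'*') is true, while digits and spaces are rejected. A drawback of this approach, shared with the member function, is that the list has to be updated by hand whenever a new enumerator is added to Operator.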





    . .



    Interpreter.h

    #pragma once
    #include <algorithm>
    #include <initializer_list>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}
        TokenType Type() const { return m_type; }
        // The conversions throw when the token holds the other kind of value.
        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }
        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }
    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}
        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) { ScanNumber(); }
                else if(IsOperator()) { ScanOperator(); }
                else { MoveNext(); }   // skip spaces and unknown characters
            }
        }
        const Tokens &Result() const { return m_result; }
    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }
        bool IsNumber() const { return iswdigit(*m_current) != 0; }
        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;   // continue right after the number
        }
        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
        }
        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }
        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







    InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }
        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }
        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }
        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }
        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }
        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }
        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }
        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }
        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







    The code for this article is available on GitHub.



    . , .








  12. std::vector, Tokens



    . .



    #pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





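To choose the smallest change that makes a failing test pass, we follow The Transformation Priority Premise (TPP). TPP orders code transformations from the simplest to the most complex and suggests preferring the simplest one that passes the current test.

The transformations, from highest priority to lowest:

  1. ({} → nil): from no code at all to code that uses nil.
  2. (nil → constant): replace nil with a constant.
  3. (constant → constant+): replace a simple constant with a more complex one.
  4. (constant → scalar): replace a constant with a variable or an argument.
  5. (statement → statements): add more unconditional statements (break, continue, return and the like).
  6. (unconditional → if): split the execution path.
  7. (scalar → array): replace a scalar with an array.
  8. (array → container): replace an array with a container.
  9. (statement → recursion): replace a statement with recursion.
  10. (if → while): replace a condition with a loop.
  11. (expression → function): replace an expression with a function or an algorithm.
  12. (variable → assignment): replace the value of a variable.

The simplest change that passes the first test is to return an empty token list.

    inline Tokens Tokenize(std::string expr) { return{}; }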
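As a rough illustration of how these transformations chain together, the sketch below grows a small function in three steps, each driven by one more test. The function names (SumV1, SumV2, SumV3) are hypothetical and are not part of the interpreter.

```cpp
#include <vector>

// Hypothetical example, unrelated to the lexer: three successive versions
// of a function that sums a list, each produced by one or two TPP steps.

// (nil → constant): the first test only asks for the sum of an empty list.
inline int SumV1(const std::vector<int> &) { return 0; }

// (unconditional → if) and (constant → scalar): a one-element list forces
// a branch and the use of the argument's value.
inline int SumV2(const std::vector<int> &values) {
    if(values.empty()) return 0;
    return values[0];
}

// (if → while): a list with several elements turns the branch into a loop.
inline int SumV3(const std::vector<int> &values) {
    int sum = 0;
    for(int value : values) sum += value;
    return sum;
}
```

Each version is only as general as the tests written so far require, which is the point of preferring the higher-priority transformations.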





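The test passes. Before moving on, we change the expression type from std::string to std::wstring, which leaves room for Unicode input. The next test tokenizes a single plus operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }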





AssertRange is a small helper namespace. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // The sizes must match before elements are compared.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
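The same comparison can be sketched without the test framework, which makes the logic easier to try on any compiler. DescribeMismatch is a hypothetical name; it returns a message instead of calling Assert, and an empty string means the ranges are equal.

```cpp
#include <initializer_list>
#include <iterator>
#include <string>

// Hypothetical, framework-free variant of the range comparison.
// Returns an empty string when the ranges are equal, otherwise a
// description of the first problem found.
template<class T, class ActualRange>
std::wstring DescribeMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);

    // Compare sizes first, so a short range is reported as such.
    if(std::distance(expectIter, std::end(expect)) != std::distance(actualIter, std::end(actual))) {
        return L"Size differs.";
    }

    // Then compare element by element and report the first difference.
    for(; expectIter != std::end(expect); ++expectIter, ++actualIter) {
        if(!(*expectIter == *actualIter)) {
            return L"Mismatch in position " + std::to_wstring(std::distance(std::begin(expect), expectIter));
        }
    }
    return L"";
}
```

Checking the size up front keeps the element loop simple and gives a clearer message than a mismatch at the last position would.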







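Operator is an enum class with wchar_t as its underlying type, so the value of each enumerator is the operator's own character. For now, Token is simply an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;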
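A minimal sketch of what the wchar_t underlying type buys: conversion between a character and an enumerator is a plain static_cast in both directions. The names Op, ToChar, FromChar and IsKnown are hypothetical and used only for illustration.

```cpp
// Hypothetical stand-in for the article's Operator enum.
enum class Op : wchar_t { Plus = L'+', Minus = L'-' };

// Enumerator to character: the value is the character itself.
inline wchar_t ToChar(Op op) { return static_cast<wchar_t>(op); }

// Character to enumerator: always compiles, even for characters that
// are not operators, so the result may match no named enumerator.
inline Op FromChar(wchar_t c) { return static_cast<Op>(c); }

// Explicit check for the named enumerators.
inline bool IsKnown(Op op) { return op == Op::Plus || op == Op::Minus; }
```

Because FromChar accepts any character, code that builds operators this way needs a separate check that the character really is an operator.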





To print the compared values in failure messages, Assert needs a ToString function for the type being compared, so we add one for Token.

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }







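Now the test compiles and fails. To pass it, we apply (unconditional → if): an empty expression still yields an empty list, and otherwise the first character is converted to an operator.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }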





The next test tokenizes a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





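This test does not compile, because a token now has to hold either an operator or a number. There are several ways to model that:

  1. A class hierarchy rooted at Token, with dynamic_cast to recover the concrete type.
  2. An approach built on std::function.
  3. A type-erased holder such as Boost.Any.

Here we take a simpler route: a single Token class that reports its own type. Its tests go into a separate TokenTests class.

    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };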





Assert also needs a ToString function for TokenType. The next test covers number tokens.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }





To pass it, we apply (constant → scalar): the hard-coded type becomes a member that each constructor sets.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };





The next test reads the operator back from an operator token.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





A stored operator and a conversion operator are enough to pass it.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };





The same test for a number token:

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the type tag and throw if the token is of the other kind.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: the function returns std::wstring.
                return L"Unknown token.";
        }
    }
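The tagged-union idea can be tried in isolation with the sketch below. TaggedValue and NumberAccessThrows are hypothetical names, and a plain wchar_t stands in for the Operator enum; the point is only to show that access through the wrong accessor is rejected rather than silently reading the inactive member.

```cpp
#include <stdexcept>

// Hypothetical stand-in for Token, reduced to the tagged-union idea.
class TaggedValue {
public:
    TaggedValue(wchar_t symbol) : m_isNumber(false), m_symbol(symbol) {}
    TaggedValue(double number) : m_isNumber(true), m_number(number) {}

    bool IsNumber() const { return m_isNumber; }

    // Checked accessors: reading the inactive union member would be
    // undefined behavior in C++, so the tag is verified first.
    wchar_t Symbol() const {
        if(m_isNumber) throw std::logic_error("Should be symbol.");
        return m_symbol;
    }

    double Number() const {
        if(!m_isNumber) throw std::logic_error("Should be number.");
        return m_number;
    }

private:
    bool m_isNumber;
    union {
        wchar_t m_symbol;
        double m_number;
    };
};

// Returns true when reading `value` as a number is rejected.
inline bool NumberAccessThrows(const TaggedValue &value) {
    try {
        value.Number();
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

Throwing std::logic_error fits here because asking a symbol for its number is a programming error, not a condition the caller is expected to recover from.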







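With Token able to hold a number, we return to the lexer. A character from '0' to '9' becomes a number token, and anything else is still treated as an operator:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };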





The next test uses a floating-point number with several digits.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





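The C standard library already covers this: isdigit detects a digit and atof parses a number, and both have counterparts for wchar_t (iswdigit, and _wtof in the Microsoft runtime). Using them is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }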





The next test expects two tokens: an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





Before handling several tokens, we refactor: tokens are collected in a local variable, result, instead of being returned from each branch.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }





The next transformation is (if → while), so that every character of the expression is processed.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;  // continue right after the parsed number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





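wcstod replaces _wtof here because it also returns a pointer to the first character after the parsed number, which tells the loop where to continue.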
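The role of the end pointer can be seen in a small standalone sketch. ParseNumber is a hypothetical helper, not part of the article's code; it wraps std::wcstod and reports how many characters the number occupied.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper: parses a number at the start of `text` and
// reports in `consumed` how many characters belonged to it.
inline double ParseNumber(const wchar_t *text, std::size_t &consumed) {
    wchar_t *end = nullptr;
    double value = std::wcstod(text, &end);  // `end` is set just past the number
    consumed = static_cast<std::size_t>(end - text);
    return value;
}
```

For the input L"12.34+5" the parsed value is 12.34 and five characters are consumed, so scanning resumes at the plus sign. If no number can be parsed, wcstod leaves end equal to text, which is why the lexer only calls it after seeing a digit.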



The next test checks that spaces are skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





We apply (unconditional → if) once more: only a plus is treated as an operator, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;  // skip spaces and anything unrecognized
        }
    }





The function has grown, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize only delegates to it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







The last lexer test uses an expression with all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }





The Operator enumeration gets the missing operators and the parentheses.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





The test still fails, because IsOperator in Tokenizer recognizes only the plus. It now checks the current character against every operator.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
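To try the whole scanning loop outside Visual Studio, here is a simplified, portable sketch. It is an assumption-laden stand-in rather than the article's lexer: SplitTokens is a hypothetical function that returns each token's text instead of a Token object, and it looks operators up in a string instead of using std::any_of.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical simplified lexer: each token is returned as its text.
inline std::vector<std::wstring> SplitTokens(const std::wstring &expr) {
    std::vector<std::wstring> result;
    const std::wstring operators = L"+-*/()";
    const wchar_t *current = expr.c_str();

    while(*current != L'\0') {
        if(std::iswdigit(static_cast<std::wint_t>(*current))) {
            // Let wcstod find where the number ends, then copy its text.
            wchar_t *end = nullptr;
            std::wcstod(current, &end);
            const wchar_t *last = end;
            result.push_back(std::wstring(current, last));
            current = last;
        }
        else if(operators.find(*current) != std::wstring::npos) {
            result.push_back(std::wstring(1, *current));
            ++current;
        }
        else {
            ++current;  // skip spaces and unknown characters
        }
    }
    return result;
}
```

With this sketch, L"1+2*3/(4-5)" splits into eleven tokens and L" 1 + 12.34 " into three, matching the token counts expected by the tests above.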





All tests pass. The complete listings follow.

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        // Tokens are equal when both the type and the stored value match.
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    } // namespace InterpreterTests







The source code for this article is available on GitHub.











, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .








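Tokens is simply a std::vector of Token. To get the test to compile, add a header in which Tokenize is a stub that throws an exception, so the test compiles but fails.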



#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: always throws, so the first test compiles but fails (red).
inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter





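The test now compiles and fails, so the code has to change. To decide how, this article follows The Transformation Priority Premise (TPP): as the tests become more specific, the code becomes more generic, and it does so through small transformations. The transformations are ordered by complexity, and at each step the simplest one that makes the failing test pass is preferred.

The transformations, in order of priority: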



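  1. ({} → nil): no code at all becomes code that uses nil.
  2. (nil → constant): nil is replaced with a constant.
  3. (constant → constant+): a simple constant becomes a more complex one.
  4. (constant → scalar): a constant is replaced with a variable or an argument.
  5. (statement → statements): more unconditional statements are added (break, continue, return, and so on).
  6. (unconditional → if): the execution path is split.
  7. (scalar → array): a scalar becomes an array.
  8. (array → container): an array becomes a container.
  9. (statement → recursion): a statement becomes a recursive call.
  10. (if → while): a condition becomes a loop.
  11. (expression → function): an expression is replaced with a function or an algorithm.
  12. (variable → assignment): the value of a variable is replaced.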
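As an illustration only (the function below is not part of the interpreter), here is how three of these transformations might move a hypothetical CountDigits function forward, one failing test at a time. The namespace names Step1 to Step3 exist just to keep the three versions side by side.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Step 1, (nil -> constant): enough for the test "an empty string has no digits".
namespace Step1 {
inline int CountDigits(const std::wstring &) { return 0; }
}

// Step 2, (unconditional -> if): enough for tests with a one-character string.
namespace Step2 {
inline int CountDigits(const std::wstring &text) {
    if(!text.empty() && text[0] >= L'0' && text[0] <= L'9') return 1;
    return 0;
}
}

// Step 3, (if -> while): the general case, written only when a test demands it.
namespace Step3 {
inline int CountDigits(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(text[i] >= L'0' && text[i] <= L'9') ++count;
        ++i;
    }
    return count;
}
}

// Usage: each step satisfies the tests that existed when it was written.
inline void CheckSteps() {
    assert(Step1::CountDigits(L"") == 0);
    assert(Step2::CountDigits(L"7") == 1);
    assert(Step3::CountDigits(L"1+23") == 3);
}
```

Each version is deliberately no more general than its tests require, which is the point of preferring the higher-priority transformation.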

The first test needs only the simplest of them: return an empty list.



inline Tokens Tokenize(std::string expr) { return{}; }





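The test passes, and there is nothing to refactor yet. Before moving on, replace std::string with std::wstring, since CppUnitTestFramework works with wide (Unicode) strings. The next test covers a single operator.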



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





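AssertRange is not part of the framework. Its AreEqual compares single values only, so a small helper is needed that checks the sizes of two ranges and then compares them element by element.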



AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // The sizes must match before the elements are compared.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
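The same comparison can be tried outside Visual Studio. The sketch below is framework-independent and is not the article's helper: FirstMismatch is a hypothetical name, and instead of raising an assertion it returns the index of the first difference.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of CppUnitTestFramework or the article).
// Returns -1 when both ranges are equal, otherwise the index of the first
// mismatch; a size difference is reported at the length of the shorter range.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long index = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    bool sameSize = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameSize ? -1 : index;
}

// Usage: compare an expected list with an actual container.
inline void CheckFirstMismatch() {
    std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1);
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);
}
```

Taking the expected values as an initializer_list keeps the call site as short as in the tests above, while the actual range can be any container that works with std::begin and std::end.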







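Operator is an enumeration with wchar_t as its underlying type, so a character from the expression can be turned into an operator with a cast. For now, Token is just an alias for Operator.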



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





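To report a failed comparison, Assert has to print both values, which requires a ToString function for the compared type, so one is added for Token.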



std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







The test fails, because Tokenize always returns an empty list. Apply the transformation (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





The test passes. The next test covers a single digit.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





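This test does not compile: so far a token can only be an operator, and now it must be able to hold a number as well. There are several ways to do this, such as a hierarchy of classes derived from Token that is inspected with dynamic_cast, a design built around std::function, or a generic holder like Boost.Any. Here the simplest option is taken: a single Token class that knows its own type.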



Development of the new Token class is also driven by tests, starting with the token type.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





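For this test to compile, a ToString function is also needed for TokenType, because Assert::AreEqual prints the compared values when they differ.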



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





The test fails. Apply the transformation (constant → scalar) and store the type in a member variable.



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





The tests pass. Next, an operator token should return the operator it was created from.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





The operator is stored in the token and exposed through a conversion operator.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





The same test for a number token:



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





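A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the token type first and throw if it does not match.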



Token

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // The conversions check the type tag before reading the union.
    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
    case TokenType::Number:
        return std::to_wstring(static_cast<double>(token));
    case TokenType::Operator:
        return ToString(static_cast<Operator>(token));
    default:
        // Wide literal: the function returns std::wstring.
        return L"Unknown token.";
    }
}
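For comparison, a compiler with C++17 support (newer than the toolchain used in this article) provides std::variant, which does the bookkeeping of the hand-written union. The sketch below is an alternative, not the article's design; VariantToken is an illustrative name, and the Operator enumeration is cut down to two values to keep the example short.

```cpp
#include <cassert>
#include <variant>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Illustrative alternative to the union-based Token (requires C++17).
class VariantToken {
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    // std::get throws std::bad_variant_access on a type mismatch, which
    // plays the role of the explicit checks in the union version.
    operator Operator() const { return std::get<Operator>(m_value); }
    operator double() const { return std::get<double>(m_value); }

private:
    std::variant<Operator, double> m_value;
};

// Usage: the same checks as in TokenTests, written as plain assertions.
inline void CheckVariantToken() {
    VariantToken op(Operator::Plus);
    VariantToken num(1.5);
    assert(op.Type() == TokenType::Operator);
    assert(static_cast<double>(num) == 1.5);
}
```

The trade-off is the exception type: the union version throws std::logic_error with a descriptive message, while this one relies on whatever std::get throws.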







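With Token in place, return to the lexer, where the single-digit test is still failing. A simple check is enough to make it pass: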



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





This handles single digits only. The next test uses a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





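The number is parsed with functions from the C standard library: isdigit recognizes a digit and atof converts the text to a double. Since the expression consists of wchar_t characters, their wide counterparts, iswdigit and _wtof, are used. This is the transformation (expression → function).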



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





The tests pass. The next test is an operator followed by a number.



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





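The test fails, because Tokenize returns after the first token. Before changing the behavior, refactor: collect the tokens in a local variable, result, and return it at the end. The existing tests must still pass.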



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





Now apply the transformation (if → while), so that every token in the expression is processed.



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





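wcstod is used here instead of _wtof because its second argument receives a pointer to the first character after the parsed number, so scanning can continue from there.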
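The end-pointer behavior of wcstod is easy to check in isolation. The sketch below wraps it in a hypothetical helper, ParseLeadingNumber, which is not part of the article's code and assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper: parses the number at the start of `text` and stores
// in `rest` the position of the first character that is not part of it.
inline double ParseLeadingNumber(const wchar_t *text, const wchar_t **rest) {
    assert(text != nullptr && rest != nullptr);
    wchar_t *end = nullptr;
    double value = std::wcstod(text, &end);
    *rest = end;
    return value;
}

// Usage: after the call, `rest` points at the operator that follows the number.
inline void CheckParseLeadingNumber() {
    const wchar_t *rest = nullptr;
    double value = ParseLeadingNumber(L"12.5+3", &rest);
    assert(value == 12.5);
    assert(*rest == L'+');
}
```

If no number can be parsed, wcstod returns zero and leaves the end pointer equal to the start, which is why the lexer calls it only after iswdigit has confirmed that a digit comes next.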



All tests pass. The next test requires the lexer to skip spaces.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





Apply (unconditional → if): a character is stored as an operator only if it is a plus sign, and anything else is skipped.



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }
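To try this loop without Visual Studio, the fragment can be completed into a standalone function. The sketch below uses only the standard library; SimpleToken and TokenizeSketch are stand-in names chosen for this example and are simpler than the article's Token and Tokenize.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Simplified stand-in for the article's Token: an operator character or a number.
struct SimpleToken {
    bool isNumber;
    wchar_t op;     // meaningful when isNumber is false
    double number;  // meaningful when isNumber is true
};

inline std::vector<SimpleToken> TokenizeSketch(const std::wstring &expr) {
    std::vector<SimpleToken> result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            double value = std::wcstod(current, &end);
            result.push_back({ true, L'\0', value });
            current = end; // continue right after the number
        } else if(*current == L'+') {
            result.push_back({ false, *current, 0.0 });
            ++current;
        } else {
            ++current; // anything else, including spaces, is skipped
        }
    }
    return result;
}

// Usage: spaces around the tokens do not show up in the result.
inline void CheckTokenizeSketch() {
    std::vector<SimpleToken> tokens = TokenizeSketch(L" 1 + 12.5 ");
    assert(tokens.size() == 3);
    assert(tokens[1].op == L'+');
}
```

Note that this version silently drops every unknown character, not just whitespace, exactly as the loop above does.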





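All tests pass, and the function has grown enough to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function only delegates to it.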



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







The remaining case is an expression that uses all the operators and parentheses.



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





The Operator enumeration needs the new values for this test to compile.



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





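The test compiles but fails. The IsOperator method of Tokenizer recognizes only the plus sign and has to be extended to all operators.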



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
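The same membership test can also be written against a string of operator characters. The sketch below is only an alternative for comparison; IsOperatorChar and kOperatorChars are hypothetical names, and the string has to be kept in sync with the Operator enumeration by hand.

```cpp
#include <cassert>
#include <cwchar>

// Sketch: operator lookup with std::wcschr instead of std::any_of.
inline bool IsOperatorChar(wchar_t c) {
    static const wchar_t kOperatorChars[] = L"+-*/()";
    // wcschr also matches the terminating L'\0', so exclude it explicitly.
    return c != L'\0' && std::wcschr(kOperatorChars, c) != nullptr;
}

// Usage: operators are recognized, digits and spaces are not.
inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'+'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
}
```

The any_of version has the advantage that it is built from the enumeration values themselves, so a renamed or changed operator cannot drift away from its character.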





All tests pass. The complete code follows.



Interpreter.h

#pragma once
#include <algorithm>
#include <cwctype>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
    case TokenType::Operator:
        return L"Operator";
    case TokenType::Number:
        return L"Number";
    default:
        throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // The conversions check the type tag before reading the union.
    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
            case Interpreter::TokenType::Operator:
                return left.m_operator == right.m_operator;
            case Interpreter::TokenType::Number:
                return left.m_number == right.m_number;
            default:
                throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
    case TokenType::Number:
        return std::to_wstring(static_cast<double>(token));
    case TokenType::Operator:
        return ToString(static_cast<Operator>(token));
    default:
        throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests







The source code for this article is available on GitHub.




InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
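For readers who want to try the lexer outside Visual Studio, the following is a condensed, framework-free sketch of the same scanning loop. It is not the article's code: SimpleToken and TokenizeSimple are hypothetical names, the Token class is replaced by a plain struct, and the operator check uses wcschr instead of the Operator enumeration.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical stand-in for the article's Token class.
struct SimpleToken {
    bool isNumber;
    double number;  // meaningful when isNumber is true
    wchar_t op;     // meaningful when isNumber is false
};

// Same loop as Detail::Tokenizer, written as a single function.
inline std::vector<SimpleToken> TokenizeSimple(const std::wstring &expr) {
    std::vector<SimpleToken> result;
    const wchar_t *current = expr.c_str();
    while (*current != L'\0') {
        if (std::iswdigit(*current)) {
            // wcstod reads the number and reports where it ends.
            wchar_t *end = nullptr;
            const double value = std::wcstod(current, &end);
            result.push_back({ true, value, L'\0' });
            current = end;
        } else if (std::wcschr(L"+-*/()", *current) != nullptr) {
            result.push_back({ false, 0.0, *current });
            ++current;
        } else {
            ++current;  // skip spaces and any other unrecognized character
        }
    }
    return result;
}
```

Calling TokenizeSimple(L"1+2*3/(4-5)") yields eleven tokens, matching the expectation in Should_tokenize_complex_experssion.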







The code for this article is available on GitHub.



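The token type is not known yet, so for now the token list is simply a std::vector of an empty Token structure, named Tokens. We write just enough code for the test to compile. Tokenize throws, so the test fails.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer
    } // namespace Interpreter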





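To decide how to make a failing test pass, we rely on The Transformation Priority Premise (TPP), formulated by Robert Martin. A refactoring changes the structure of code without changing its behavior. A transformation does the opposite: it changes behavior, and every new test is passed by applying one. Transformations are ordered by priority, and a simpler one is preferred over a more complex one whenever it is sufficient.

The transformations, in order of decreasing priority:

  1. ({} → nil) No code at all is replaced with code that uses nil.
  2. (nil → constant) Nil is replaced with a constant.
  3. (constant → constant+) A simple constant is replaced with a more complex one.
  4. (constant → scalar) A constant is replaced with a variable or an argument.
  5. (statement → statements) More unconditional statements are added (break, continue, return and the like).
  6. (unconditional → if) The execution path is split.
  7. (scalar → array) A scalar is replaced with an array.
  8. (array → container) An array is replaced with a more complex container.
  9. (statement → recursion) A statement is replaced with recursion.
  10. (if → while) A condition is replaced with a loop.
  11. (expression → function) An expression is replaced with a function or an algorithm.
  12. (variable → assignment) The value of a variable is replaced.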
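As a rough illustration of how the first few transformations look in code, here is a sketch built around a hypothetical function, Sign, which is not part of the interpreter. Each version is assumed to be the smallest change that makes one more test pass.

```cpp
// Hypothetical example, unrelated to the interpreter itself.

// (nil -> constant): the first test only expects 0 for the input 0,
// so returning a constant is enough.
inline int SignV1(int /*value*/) {
    return 0;
}

// (unconditional -> if): a second test expects 1 for a positive input,
// so the execution path is split.
inline int SignV2(int value) {
    if (value > 0) return 1;
    return 0;
}

// (unconditional -> if) once more: a third test expects -1 for a
// negative input.
inline int SignV3(int value) {
    if (value > 0) return 1;
    if (value < 0) return -1;
    return 0;
}
```

None of the versions needs a loop or an assignment, which is the point of the priority order: lower-priority transformations are introduced only when a test cannot be passed without them.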

Applying the first transformation makes the test pass:

    inline Tokens Tokenize(std::string expr) {
        return{};
    }





Before moving on, std::string is replaced with std::wstring so that the lexer can accept Unicode input. The next test checks a single operator:

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }





AssertRange is a small helper of our own. Its AreEqual function compares an expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
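The same comparison can be sketched without the Visual Studio test framework. The helper below, FirstMismatch, is a hypothetical name introduced only for this illustration: instead of asserting, it returns the position of the first difference, or -1 when the two sequences are equal.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: index of the first mismatch between the expected
// list and the actual range, or -1 if they are equal. When one sequence
// is a prefix of the other, the index where the shorter one ends is
// returned.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;

    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++index) {
        if (!(*expectIter == *actualIter)) return index;
    }

    const bool sameLength =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameLength ? -1 : index;
}
```

As in AssertRange::AreEqual, the element type T is deduced from the braced list, so the call site stays as short as FirstMismatch({ 1, 2, 3 }, values).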







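Operator is an enumeration whose underlying type is wchar_t, so the value of each operator is simply the character that denotes it. For now, Token is just an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;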





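For the test to compile, Assert::AreEqual also needs a ToString function for the compared type, which it uses to print values in failure messages.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }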







The test now compiles and fails. To pass it, we apply the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }





The next test is a number consisting of a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





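A token must now be able to hold either an operator or a number. There are several ways to model this, including a class hierarchy with a common Token base inspected through dynamic_cast, a design built around std::function, and a ready-made universal container such as Boost.Any. We take a simpler route: a single Token class that knows its own type.

The class gets its own tests, starting with the type of an operator token.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };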





A ToString function for TokenType is needed as well, for the same reason as before. The next test covers a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }





To pass it, we apply the (constant → scalar) transformation: the hard-coded type becomes a member variable.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };





Next, an operator token must give back the operator it was created with.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





The operator is stored in the token and exposed through a conversion operator.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
    …
        Operator m_operator;
    };





The same is required for numbers.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





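A token is either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the stored type and throw if the wrong one is requested.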



Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: a narrow string cannot be converted to std::wstring.
                return L"Unknown token.";
        }
    }
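For comparison, with a C++17 compiler the type tag and the union could be delegated to std::variant. This is an alternative sketch, not what the article uses: VariantToken and ThrowsWhenReadAsNumber are hypothetical names, and the Operator enumeration is trimmed to two values to keep the example short.

```cpp
#include <stdexcept>
#include <variant>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Same interface as the Token class above, but std::variant keeps track
// of which alternative is stored.
class VariantToken {
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    operator Operator() const {
        if (auto op = std::get_if<Operator>(&m_value)) return *op;
        throw std::logic_error("Should be operator token.");
    }

    operator double() const {
        if (auto num = std::get_if<double>(&m_value)) return *num;
        throw std::logic_error("Should be number token.");
    }

private:
    std::variant<Operator, double> m_value;
};

// Hypothetical helper for quick checks: true if reading the token as a
// number is rejected with std::logic_error.
inline bool ThrowsWhenReadAsNumber(const VariantToken &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch (const std::logic_error &) {
        return true;
    }
}
```

The hand-written union is arguably the better fit for the compiler targeted in the article, which predates C++17. The variant version mainly removes the need to keep m_type and the union in sync by hand.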







With the Token class in place, we return to the lexer test for a single digit. The simplest way to pass it:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };





The next test is a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





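The C standard library already provides what we need: isdigit to recognize a digit and atof to convert a string to a number. Since the lexer works with wchar_t, the wide-character counterparts iswdigit and _wtof are used instead (_wtof is specific to the Microsoft C runtime). This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }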





The next test is an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





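Handling more than one token requires a loop, but first a small refactoring is needed: the tokens are collected in a local variable, result, which is returned at the end of the function.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }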





Now the transformation itself: (if → while).

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





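wcstod is used here instead of _wtof because, besides the value, it reports through its second argument where the number ends. This lets us move the current pointer past the number that was just read.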
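The behavior of the end pointer is easiest to see in isolation. The sketch below wraps wcstod in a hypothetical helper, ScanLeadingNumber, which returns both the parsed value and the text that follows it. ScanResult is likewise a name made up for this example.

```cpp
#include <cwchar>
#include <string>

// Hypothetical result type: the parsed value and the unparsed remainder.
struct ScanResult {
    double value;
    std::wstring rest;
};

// Parses the number at the start of `text`. After the call to wcstod,
// `end` points at the first character that is not part of the number.
inline ScanResult ScanLeadingNumber(const std::wstring &text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text.c_str(), &end);
    return { value, std::wstring(end) };
}
```

For the input L"12.5+3" the value is 12.5 and the remainder is L"+3". Note that wcstod on its own also accepts leading whitespace and a sign, which is one reason the lexer calls it only after iswdigit has confirmed that the current character is a digit.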



The next test requires spaces to be skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





This takes the (unconditional → if) transformation: only a plus sign is accepted as an operator, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }





The function has grown enough to deserve refactoring. The scanning logic moves into a Tokenizer class inside the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







The last lexer test is a complex expression that uses every operator and both parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }





First, the remaining operators are added to the Operator enumeration.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





Then IsOperator in the Tokenizer class is updated to recognize all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
    }
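A shorter way to express the same check, shown here only as a sketch and not as the article's implementation, is to look the character up in a string of all operator characters. IsOperatorChar is a hypothetical free function, and the operator string is assumed to mirror the values of the Operator enumeration.

```cpp
#include <cwchar>

// Alternative sketch: true if `c` is one of the operator characters.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also matches the terminating null character of the search
    // string, so that case is excluded explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The trade-off is that the list of characters now lives in two places, the enumeration and the string literal, whereas the std::any_of version is driven by the enumeration values themselves.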





. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange is a helper namespace of our own. Its AreEqual compares the expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
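Outside Visual Studio, the same comparison can be sketched with the standard library alone. FirstMismatch below is a hypothetical helper, not part of CppUnitTestFramework, and it assumes C++14 for the four-iterator overload of std::mismatch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Returns -1 when both ranges are equal. Otherwise returns the index of the
// first differing element, or the length of the shorter range when one range
// is a prefix of the other.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto result = std::mismatch(expect.begin(), expect.end(), std::begin(actual), std::end(actual));
    if(result.first == expect.end() && result.second == std::end(actual)) {
        return -1;
    }
    return std::distance(expect.begin(), result.first);
}

// Usage: the second element differs, so the reported position is 1.
inline void FirstMismatchExample() {
    std::vector<int> actual = { 1, 9, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == 1);
}
```

As in AssertRange::AreEqual, the expected values are passed as a braced list, so the element type is deduced from the literals.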







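The Operator enumeration does not exist yet. We give it the underlying type wchar_t, so that the value of each operator is the character that denotes it. A token, for now, is just an operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;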





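For the test to compile, Assert must also be able to print a token in a failure message. It does that through ToString, so we provide one.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}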







The test now compiles and fails. To make it pass we apply (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





The next test on the list is a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





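This test does not even compile: a token now has to hold either an operator or a number. There are several ways to achieve that. One is a class hierarchy rooted at Token, with dynamic_cast to get at the concrete type. Another can be built on std::function. A third is Boost.Any, which is ruled out here because we use only the standard library.

We take a simpler route: a single Token class that knows its own type. That class needs tests of its own, so they go onto the test list, and we start with the type of an operator token.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}

    TokenType Type() const {
        return TokenType::Operator;
    }
};

…

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};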





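ToString for TokenType is needed as well for the assertion to compile; it is trivial, so it is not shown here. The next test is the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}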





The test fails. We apply (constant → scalar) and store the type in a field.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}

    Token(double) :m_type(TokenType::Number) {}

    TokenType Type() const {
        return m_type;
    }

private:
    TokenType m_type;
};





Next, the operator stored in an operator token.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





We store the operator and add a conversion to it.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

    operator Operator() const {
        return m_operator;
    }
    …
    Operator m_operator;
};





The same test for a number.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token never holds an operator and a number at the same time, so both fields can share storage in a union. Each conversion checks the token type first and throws if it does not match.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const {
        return m_type;
    }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
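The effect of the type check can be seen in a short, self-contained sketch. It repeats a compact copy of the Token class so that it compiles on its own; ReadingAsNumberThrows and TokenUsage are hypothetical names added only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Compact copy of the tagged union described in the text, for illustration only.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a number is rejected.
inline bool ReadingAsNumberThrows(const Token &token) {
    try {
        double value = token; // calls operator double(), which checks the type
        (void)value;
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

// Usage: an operator token refuses to be read as a number, a number token does not.
inline void TokenUsage() {
    assert(ReadingAsNumberThrows(Token(Operator::Plus)));
    assert(!ReadingAsNumberThrows(Token(1.5)));
}
```

The check matters because reading the inactive member of a union is undefined behavior; the exception turns a silent error into a visible one.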







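With Token in place we return to the lexer and its test for a single digit. The simplest implementation that makes it pass:

if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };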





It is naive, but the test passes. The next test is a floating point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





Rather than parse digits by hand, we reuse the C library: functions such as isdigit and atof, or rather their counterparts for wchar_t. This is the transformation (expression → function).

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}





The test passes. Next, an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





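So far the function returns as soon as it has recognized one token. To return several, we refactor first: tokens are collected in the result variable, which is returned at the end.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}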





Now the transformation: (if → while). The condition becomes a loop that runs until the end of the string.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





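wcstod replaces _wtof because, besides the value, it reports through its second argument where the number ended. That tells the loop how far to advance, which _wtof cannot do.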
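The role of the end pointer is easier to see in isolation. The sketch below is a hypothetical ScanNumbers helper, not part of the lexer, that collects every number in a string and relies only on std::wcstod from the standard library.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in the text.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current) {
        wchar_t *end = nullptr;
        double value = std::wcstod(current, &end);
        if(end == current) {
            // Nothing was converted: wcstod left the end pointer at the start.
            ++current;
            continue;
        }
        numbers.push_back(value);
        current = end; // continue right after the characters that were consumed
    }
    return numbers;
}

// Usage: the operator and the spaces are skipped, both numbers are found.
inline void ScanNumbersExample() {
    std::vector<double> numbers = ScanNumbers(L"1 + 2.5");
    assert(numbers.size() == 2);
    assert(numbers[0] == 1.0);
    assert(numbers[1] == 2.5);
}
```

Note that wcstod itself skips leading whitespace and accepts a sign, so `+2` on its own would be read as one number. The lexer in the text calls it only when the current character is a digit, which keeps the sign as a separate operator token.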



The test passes. Next, spaces.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





We apply (unconditional → if): only a known operator becomes a token, and any other character is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}





The tests pass. The function has grown, so it is time to refactor. The logic moves into a Tokenizer class in the Detail namespace, and Tokenize only delegates to it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const {
        return m_result;
    }

private:
    bool EndOfExperssion() const {
        return *m_current == L'\0';
    }

    bool IsNumber() const {
        return iswdigit(*m_current) != 0;
    }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() {
        ++m_current;
    }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







One test remains: a complex expression that uses every operator and parentheses.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}





The Operator enumeration gets the operators that were missing.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





The test compiles and fails. The IsOperator method of Tokenizer has to recognize all of them.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {
        return *m_current == static_cast<wchar_t>(o);
    });
}
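Because every operator is a single character, the same check could also be written as a lookup in a string of operator characters. This is only an alternative sketch under that assumption; IsOperatorChar is a hypothetical free function and not the code the lexer uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical alternative: look the character up in a fixed set of operator characters.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    // find() searches only the six characters above, so the terminating
    // L'\0' of an expression is never reported as an operator.
    return operators.find(c) != std::wstring::npos;
}

// Usage: operators and parentheses match, digits and spaces do not.
inline void IsOperatorCharExample() {
    assert(IsOperatorChar(L'*'));
    assert(IsOperatorChar(L'('));
    assert(!IsOperatorChar(L'7'));
    assert(!IsOperatorChar(L' '));
}
```

The version with std::any_of keeps the enumeration as the single source of truth for the operator set, while the string lookup has to be updated by hand whenever Operator changes.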





All tests pass, and the lexer is finished. The complete header:

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const {
        return m_type;
    }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const {
        return m_result;
    }

private:
    bool EndOfExperssion() const {
        return *m_current == L'\0';
    }

    bool IsNumber() const {
        return iswdigit(*m_current) != 0;
    }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() {
        ++m_current;
    }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter












. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







The code for this article is on GitHub.



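This test does not compile yet. We need a Tokenize function that returns a std::vector of tokens, hidden behind the typedef Tokens for convenience. The first version does the bare minimum: it compiles, and the test fails.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: throws so that the first test goes red.
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter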





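The test compiles and fails, so now it has to pass with the smallest change possible. A useful guide here is The Transformation Priority Premise (TPP) by Robert Martin. Refactorings change the structure of code without changing its behavior; transformations change the behavior, moving the code from the specific to the generic. The premise is that simpler transformations should be preferred over more complex ones.

The transformations, from simplest to most complex:

  1. ({} → nil) Replace no code at all with code that uses nil.
  2. (nil → constant) Replace nil with a constant.
  3. (constant → constant+) Replace a simple constant with a more complex one.
  4. (constant → scalar) Replace a constant with a variable or an argument.
  5. (statement → statements) Add more unconditional statements (break, continue, return and the like).
  6. (unconditional → if) Split the execution path with a condition.
  7. (scalar → array) Replace a variable or argument with an array.
  8. (array → container) Replace an array with a container.
  9. (statement → recursion) Replace a statement with recursion.
  10. (if → while) Replace a condition with a loop.
  11. (expression → function) Replace an expression with a function or an algorithm.
  12. (variable → assignment) Change the value of a variable.

To pass the first test, the simplest change is enough.

    inline Tokens Tokenize(std::string expr) { return{}; }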





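The test passes, and there is nothing to refactor yet. On to the next test. Note that from here on std::string is replaced with std::wstring, for Unicode support.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }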





AssertRange is a small helper namespace written for these tests. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of a mismatch.

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // The sizes must match before the elements are compared.
        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
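Outside Visual Studio, the same element-by-element comparison can be sketched without CppUnitTestFramework. The function below is only an illustration: FirstMismatch is a hypothetical name that does not appear in the article, and it returns a position instead of raising a test failure.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper, not part of the article's code.
// Returns -1 when both ranges have the same size and equal elements,
// otherwise the index of the first mismatch. If one range is a proper
// prefix of the other, the length of the shorter range is returned.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual)
{
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;

    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position)
    {
        if(!(*expectIter == *actualIter)) return position;
    }

    const bool sameSize = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameSize ? -1 : position;
}
```

With a std::vector<int> holding 1, 2, 3, calling FirstMismatch({ 1, 5, 3 }, v) reports position 1, which is the kind of information the "Mismatch in position" message carries in the test helper.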







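Operator is an enum class with wchar_t as its underlying type, so the value of each enumerator is the character of the operator itself. For now a token can only be an operator, so Token is just an alias for it.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;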





To report a failed comparison, Assert has to print the compared values, and for that it needs a ToString function for the compared type.

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }







Now everything compiles and the test fails. To make it pass, apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }





The tests pass. The next test is a number made of a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





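This test raises a design question: a token can now be either an operator or a number, and the token list has to hold both. Among the alternatives are a class hierarchy with Token as the base class, where recovering the concrete type takes dynamic_cast; callbacks based on std::function; and a type-erasing container such as Boost.Any. I settle on a simpler option: a single Token class that knows its own type.

The new class gets its own tests. The first one checks the type of an operator token.

    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };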





ToString is needed for TokenType as well, for the same reason as before. The next test checks the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }





The test fails. Apply (constant → scalar): the constant returned from Type becomes a member variable set by the constructors.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };





Next, an operator token should return the operator it was created with.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





The implementation stores the operator and adds a conversion to Operator.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };





The same test for a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





The implementation mirrors the operator case. A token holds either an operator or a number, never both, so the two fields can share storage in a union. A conversion to the wrong type throws.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token."; // wide literal, since the result is std::wstring
        }
    }
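To see the type check in action, here is a minimal standalone sketch. It repeats the Token class in reduced form with a single operator, and ThrowsOnNumberAccess is a hypothetical helper added only for this illustration; it is not part of the article's code.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Reduced copy of the article's Token, kept here so the sketch is self-contained.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a number throws.
inline bool ThrowsOnNumberAccess(const Token &token)
{
    try {
        static_cast<void>(static_cast<double>(token));
    }
    catch(const std::logic_error &) {
        return true;
    }
    return false;
}
```

Reading Token(Operator::Plus) as a double throws std::logic_error, while Token(1.5) converts back to 1.5. The type tag is what makes reading the union safe: only the member that was written is ever read.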







With Token in place, back to the lexer test for a single digit. The simplest implementation:

    if(expr[0] >= L'0' && expr[0] <= L'9') {
        return{ static_cast<double>(expr[0] - L'0') };
    }
    return{ static_cast<Operator>(expr[0]) };





The test passes, but only single digits are handled. The next test is a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





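Parsing numbers by hand is not worth it when the C standard library already does this: isdigit recognizes a digit, and atof converts a string to a number. Since the lexer works with wchar_t, their wide counterparts iswdigit and _wtof are used instead. This is (expression → function).

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        // _wtof is specific to the Microsoft C runtime.
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }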





The next test is an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





Before making it pass, a small preparation: tokens are collected in a result variable instead of being returned on the spot. The behavior stays the same.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }





Now the transformation itself: (if → while). The loop consumes the expression token by token.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // continue right after the number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





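wcstod is used here instead of _wtof because, besides the value, it returns through its second parameter a pointer to the first character after the number, so the loop knows where to continue.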
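The end pointer is easiest to understand on a small example. The sketch below wraps std::wcstod in a hypothetical helper, ParseNumber, which is not part of the article's code and exists only to show how many characters a single call consumes.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper, for illustration only.
// Parses one number at the start of `text` and stores in `consumed`
// the number of characters that wcstod used. When no number can be
// parsed, wcstod returns 0 and leaves the end pointer at `text`,
// so `consumed` is 0.
inline double ParseNumber(const wchar_t *text, std::size_t &consumed)
{
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    consumed = static_cast<std::size_t>(end - text);
    return value;
}
```

For the input L"12.5+3" the helper returns 12.5 and reports 4 consumed characters, leaving the scan position on the plus sign. Keep in mind that wcstod also accepts leading whitespace, a sign and an exponent, which is why the lexer calls it only after iswdigit has confirmed that the current character is a digit.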



The next test: the lexer should skip spaces.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





Apply (unconditional → if): a character becomes an operator token only if it really is an operator, and anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current; // not a token: skip the character
        }
    }





Time to refactor. The function has grown and does several things at once, so the logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







The refactoring is done and the tests still pass. The last test for the lexer is a complex expression with all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }





Operator gets the remaining operators and the parentheses.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





IsOperator in Tokenizer now has to check the current character against all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                     Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
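For comparison, the same check can be sketched as a lookup in a string of operator characters. This is an alternative, not the article's approach: IsOperatorChar is a hypothetical free function, and the character list is assumed to mirror the six Operator values.

```cpp
#include <cwchar>

// Hypothetical alternative to Tokenizer::IsOperator, for illustration only.
// Assumes the operator set is exactly + - * / ( ), as in the Operator enum.
inline bool IsOperatorChar(wchar_t c)
{
    // wcschr also finds the terminating L'\0' of the search string,
    // so the end-of-string character is excluded explicitly.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The lookup is shorter, but it duplicates the operator set in a second place, so the enum and the string have to be kept in sync by hand. The version with std::any_of stays tied to the Operator enumerators.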





All tests pass. The complete code of the lexer and its tests follows.

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
                         Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)),
                         L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    } // namespace InterpreterTests







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    } // namespace InterpreterTests







GitHub . , . "__".



. , .




std::vector, Tokens



. .



    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Deliberately failing stub: the first test must be red before it goes green.
    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer
    } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



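({} → nil) Replace the absence of code with code that uses a nil value.
(nil → constant) Replace the nil value with a constant.
(constant → constant+) Replace a simple constant with a more complex one (for example, a one-letter string with a string of several letters).
(constant → scalar) Replace a constant with a variable or an argument.
(statement → statements) Add unconditional statements (break, continue, return and the like).
(unconditional → if) Split the execution path with a conditional statement.
(scalar → array) Replace a variable/argument with an array.
(array → container) Replace an array with a more complex container.
(statement → recursion) Replace a statement with recursion.
(if → while) Replace a conditional statement with a loop.
(expression → function) Replace an expression with a function.
(variable → assignment) Change the value of a variable.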

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }





AssertRange



- , AreEqual



, , , .



AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
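The same element-by-element comparison can be tried outside Visual Studio. The following is only a sketch in standard C++: `FirstMismatch` and `FirstMismatchExample` are hypothetical names that do not belong to CppUnitTestFramework, and instead of failing a test the helper returns the position of the first difference.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first position where the two
// ranges differ, or -1 when they are equal. When one range is a prefix of the
// other, the index where the shorter range ends is returned.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    const bool sameLength = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameLength ? -1 : index;
}

// Usage example: compare a vector against the expected values.
inline void FirstMismatchExample() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1);  // equal ranges
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);   // differs at index 1
}
```

Returning an index keeps the helper independent of any test framework; a real assertion would turn that index into a failure message, as `AssertRange::AreEqual` does.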







Operator



. wchar_t



, , .



    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }
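Why an assertion needs `ToString` at all can be illustrated without the framework. The sketch below is an analogy, not the implementation of `Assert::AreEqual`: `DescribeMismatch`, `DescribeMismatchExample` and the `Demo` namespace are hypothetical names. It shows a generic comparison that can only build its failure message if a `ToString` overload exists for the compared type.

```cpp
#include <cassert>
#include <string>

namespace Demo {

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// Overload found through argument-dependent lookup by the helper below.
inline std::wstring ToString(const Operator &op) {
    return std::wstring(1, static_cast<wchar_t>(op));
}

} // namespace Demo

// Hypothetical stand-in for an assertion: returns an empty string when the
// values are equal, otherwise a message built with ToString. Instantiating it
// for a type that has no ToString overload is a compile-time error.
template<class T>
std::wstring DescribeMismatch(const T &expected, const T &actual) {
    if(expected == actual) return std::wstring();
    return L"Expected <" + ToString(expected) + L"> but was <" + ToString(actual) + L">";
}

// Usage example.
inline void DescribeMismatchExample() {
    assert(DescribeMismatch(Demo::Operator::Plus, Demo::Operator::Plus).empty());
    assert(DescribeMismatch(Demo::Operator::Plus, Demo::Operator::Minus)
           == L"Expected <+> but was <->");
}
```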







, , . , (unconditional → if).



    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }





. . . …

.



    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };

    …

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };





ToString



TokenType



, , . .



    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }





. (constant → scalar) .



    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}

        TokenType Type() const { return m_type; }

    private:
        TokenType m_type;
    };





. . . .

.



    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





.



    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

        operator Operator() const { return m_operator; }

        …

        Operator m_operator;
    };





.



    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





, , union



. .



Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        // Only the member selected by m_type holds a valid value.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token.";  // wide literal: the function returns std::wstring
        }
    }
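For comparison, the same "type tag plus union" idea can be expressed with `std::variant`. This is a swapped-in technique, not what the article uses: it assumes a C++17 compiler, which the Visual Studio 2013 toolchain described above does not provide, and `VariantToken`, `AsOperator`, `AsNumber` and `VariantTokenExample` are hypothetical names.

```cpp
#include <cassert>
#include <variant>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Sketch: std::variant stores the tag and the value together, so the manual
// m_type field and the anonymous union are no longer needed.
class VariantToken {
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    // std::get throws std::bad_variant_access when the other type is stored.
    Operator AsOperator() const { return std::get<Operator>(m_value); }
    double AsNumber() const { return std::get<double>(m_value); }

    friend bool operator==(const VariantToken &left, const VariantToken &right) {
        return left.m_value == right.m_value;
    }

private:
    std::variant<Operator, double> m_value;
};

// Usage example.
inline void VariantTokenExample() {
    const VariantToken plus(Operator::Plus);
    const VariantToken number(1.5);
    assert(plus.Type() == TokenType::Operator);
    assert(number.AsNumber() == 1.5);
    assert(!(plus == number));
}
```

The hand-written union works on older compilers and allows implicit conversion operators; the variant version trades that for a generated comparison and built-in checking of the stored type.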







, , , . . .



. . . .

:



    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };





, .



. .

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }





. .



    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





, . , . . result



, .



    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        } else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }





: (if → while). , .



    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            } else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }
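The loop depends on `wcstod` reporting where the parsed number ends, which `_wtof` cannot do. A minimal standalone sketch of that behavior follows; `ScanNumbers` is a hypothetical helper that keeps only the numeric branch of the loop and skips every other character.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <vector>

// Collects every number found in the text and ignores all other characters.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    assert(text != nullptr);
    std::vector<double> numbers;
    const wchar_t *current = text;
    while(*current != L'\0') {
        if(std::iswdigit(static_cast<std::wint_t>(*current))) {
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(current, &end));
            current = end;  // continue right after the last character of the number
        } else {
            ++current;      // not a digit: move to the next character
        }
    }
    return numbers;
}
```

Because the number branch is entered only on a digit, `wcstod` always consumes at least one character, so the loop cannot stall. Note that `wcstod` accepts more than plain decimals (exponents, and hexadecimal forms on C99-conforming libraries), so a stricter lexer may want its own number scanner.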





wcstod



, _wtof



, . , . , .



. .

.



    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





(unconditional → if) , .



    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        } else {
            ++current;
        }
    }





. . Detail



. Tokenize



.



    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }





Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







, . , , . .



    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }





Operator



, .



    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





. IsOperator



Tokenizer



.



    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
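The membership check can be exercised on its own. The sketch below assumes a free function instead of a `Tokenizer` member; `IsOperatorChar` is a hypothetical name, and a static array replaces the local `initializer_list`, so the table is built once rather than on every call.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Returns true when the character is the code of one of the known operators.
inline bool IsOperatorChar(wchar_t symbol) {
    static const Operator all[] = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen,
    };
    return std::any_of(std::begin(all), std::end(all),
                       [symbol](Operator o) { return symbol == static_cast<wchar_t>(o); });
}

// Usage example.
inline void IsOperatorCharExample() {
    assert(IsOperatorChar(L'*'));
    assert(!IsOperatorChar(L'1'));
}
```

Since each enumerator's value is the operator's own character, a successful check also justifies the `static_cast<Operator>` performed in `ScanOperator`.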





. .



. , .




 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







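The test compiles and fails, because the function always returns an empty list. To make it pass, we split the execution path (unconditional → if).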



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





The test passes. The next item on the test list is a single digit.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





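The test does not even compile: a token list now has to hold both operators and numbers, so Token can no longer be an alias for Operator. Several designs are possible:

a class hierarchy with Token as the base class, where the concrete type is recovered with dynamic_cast;
a callback-based design built on std::function;
a type-erased holder such as Boost.Any.

Here we take a simpler route: a single Token class that stores its type together with its value.

The new class needs tests of its own, so the test list grows:

… Get the type of an operator token. Get the type of a number token. Get the operator from an operator token. Get the number from a number token.

Start with the first one.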



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





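ToString also has to be defined for TokenType, otherwise the assertion does not compile. The test passes. The next one covers number tokens.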



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





The test fails. Apply the (constant → scalar) transformation: the constant becomes a member variable.



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





The test passes. Next, a token has to give back the operator it was constructed with.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





Add a member for the operator and a conversion operator.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





The same test for numbers.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





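A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the token type and throw if it does not match.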



Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // The conversions throw when the token holds the other kind of value.
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
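For comparison, on a compiler with C++17 support the tag plus union pair could be replaced by std::variant, which tracks the active alternative itself. This is only a sketch of that alternative, not the article's implementation; VariantToken is a hypothetical name, and the compiler used in the article may not support std::variant.

```cpp
#include <stdexcept>
#include <variant>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Hypothetical variant-based token with the same public interface as Token.
class VariantToken {
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    // Same contract as the union version: throw on a type mismatch.
    operator Operator() const {
        if (auto op = std::get_if<Operator>(&m_value)) return *op;
        throw std::logic_error("Should be operator token.");
    }
    operator double() const {
        if (auto num = std::get_if<double>(&m_value)) return *num;
        throw std::logic_error("Should be number token.");
    }

private:
    std::variant<Operator, double> m_value;
};
```

The existing TokenTests would apply to this class unchanged, which is a useful property of testing through the public interface only.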







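Back to the lexer. With Token in place, the single-digit test compiles and fails. The simplest way to make it pass is one more branch: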



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





The test passes, and there is nothing to refactor. The next test is a floating point number.

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





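To make it pass we turn to the C standard library: isdigit to detect a digit and atof to convert text to a number, or rather their wchar_t counterparts iswdigit and _wtof (the latter is specific to the Microsoft C runtime). This is the (expression → function) transformation.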



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





The test passes. Next: an operator followed by a number.



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





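The test fails, because the function returns right after the first token. Before generalizing, we refactor: the tokens are collected in a local variable, result, which is returned at the end.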



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





Now the transformation: (if → while). The condition becomes a loop, and the pointer advances after each token.



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





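Note that wcstod is used here instead of _wtof. Both convert text to a number, but wcstod also reports, through its second parameter, where the number ends, so the loop knows where to continue scanning.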
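The role of the end pointer is easier to see in a small standalone function. The sketch below uses only the standard library; ScanNumbers is a hypothetical name, and unlike the lexer it simply ignores every character that does not start a number.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects every number in the text and skips
// all other characters.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while (*current) {
        if (std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod parses as far as it can and stores the stop position in end.
            numbers.push_back(std::wcstod(current, &end));
            current = end; // continue right after the parsed number
        } else {
            ++current;
        }
    }
    return numbers;
}
```

Because the loop calls wcstod only when the current character is a digit, the function always consumes at least one character, so the loop is guaranteed to advance.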



The test passes. The next test checks that spaces are skipped.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





Apply (unconditional → if): only a known operator character produces a token, and any other character is skipped.



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





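The test passes, but the function has grown, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.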



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext(); // skip anything that is not a token
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







Only one test is left on the list: a complex expression that uses all the operators and parentheses.



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





For the test to compile, Operator gets the missing enumerators.



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





The test fails. IsOperator in Tokenizer has to recognize all the operators, not just the plus.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
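The check can also be tried as a free function outside the class. The sketch below assumes the same Operator enumeration; IsOperatorChar is a hypothetical name used only for illustration.

```cpp
#include <algorithm>
#include <iterator>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function variant of IsOperator.
inline bool IsOperatorChar(wchar_t c) {
    // The list of known operators is built once and reused on every call.
    static const Operator all[] = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen,
    };
    return std::any_of(std::begin(all), std::end(all),
        [c](Operator o) { return c == static_cast<wchar_t>(o); });
}
```

One drawback of this approach is that the list has to be kept in sync with the enumeration by hand: a new enumerator that is not added to the list is silently skipped by the lexer.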





All the tests pass. The complete code of the lexer and its tests follows.



Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}







GitHub . , . "__".



. , .








std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







The next test covers an expression that uses every operator as well as parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }





The Operator enumeration has to be extended first, otherwise the test does not compile.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





The test still fails. Changing IsOperator in Tokenizer so that it recognizes every operator makes it pass.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
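For comparison, the same membership check can be written as a lookup in a string of operator characters. This is only a sketch of an alternative, not the article's implementation. IsOperatorChar is a hypothetical name, and the character set is assumed to match the Operator enumeration.

```cpp
#include <cassert>
#include <string>

// Assumption: the operator characters mirror the article's Operator enumeration.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    // find() searches only the six stored characters, so L'\0' is never matched.
    return operators.find(c) != std::wstring::npos;
}

// Small self-check showing the intended use.
inline void IsOperatorCharDemo() {
    assert(IsOperatorChar(L'*'));
    assert(!IsOperatorChar(L'x'));
}
```

The lookup is shorter, but it repeats the operator characters outside the enumeration, so the two lists have to be kept in sync by hand. The std::any_of version names the enumerators explicitly.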





The complete code at this point is listed below.

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                } else if(IsOperator()) {
                    ScanOperator();
                } else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                         Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer

    } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







The code for this article is available on GitHub.




Tokens is simply a std::vector of tokens. The minimal code that lets the first test compile:

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer

    } // namespace Interpreter





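The test compiles and fails, so the next step is to make it pass with the smallest possible change. To choose that change, this article follows The Transformation Priority Premise (TPP). Under TPP, code evolves through a fixed list of transformations ordered from the simplest to the most complex, and at each step the simplest transformation that makes the failing test pass is preferred. Following TPP tends to produce simpler algorithms than jumping straight to a complex change.

The transformations, in priority order:

  1. ({} → nil): from no code at all to code that returns nothing.
  2. (nil → constant): replace nil with a constant.
  3. (constant → constant+): replace a simple constant with a more complex one.
  4. (constant → scalar): replace a constant with a variable or an argument.
  5. (statement → statements): add more unconditional statements (break, continue, return and the like).
  6. (unconditional → if): split the execution path.
  7. (scalar → array): replace a variable with an array.
  8. (array → container): replace an array with a container.
  9. (statement → recursion): replace a statement with recursion.
  10. (if → while): replace a condition with a loop.
  11. (expression → function): replace an expression with a function or algorithm.
  12. (variable → assignment): replace the value of a variable.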
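To illustrate how such transformations follow one another, here is a minimal sketch on a toy problem unrelated to the lexer: summing the decimal digits in a string. SumDigits and the Step1 to Step3 namespaces are hypothetical names chosen for this example, and the input is assumed to contain only digits.

```cpp
#include <cassert>
#include <string>

namespace Step1 {
// (nil -> constant): a test for the empty string only needs a constant.
inline int SumDigits(const std::string &) { return 0; }
}

namespace Step2 {
// (unconditional -> if) and (constant -> scalar): a one-digit test forces
// a branch and a value computed from the argument.
inline int SumDigits(const std::string &text) {
    if(text.empty()) return 0;
    return text[0] - '0';
}
}

namespace Step3 {
// (if -> while): a multi-digit test turns the branch into a loop.
inline int SumDigits(const std::string &text) {
    int sum = 0;
    std::string::size_type i = 0;
    while(i < text.size()) {
        sum += text[i] - '0';
        ++i;
    }
    return sum;
}
}

// Small self-check: each step still satisfies the tests of the previous ones.
inline void SumDigitsDemo() {
    assert(Step2::SumDigits("") == Step1::SumDigits(""));
    assert(Step3::SumDigits("7") == Step2::SumDigits("7"));
}
```

Each step is driven by exactly one new failing test and uses the highest-priority transformation that satisfies it.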

For the first test, ({} → nil) is enough.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }





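The test passes, and there is nothing to refactor yet.

Before the next test, std::string is replaced with std::wstring so that expressions are handled as Unicode text. The next test checks a single operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }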





AssertRange is a small helper namespace written for these tests. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
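The same element-by-element comparison can be sketched without any test framework, which may be useful when trying the idea outside Visual Studio. FirstMismatch is a hypothetical helper, not part of the article's code. It returns the index of the first difference instead of raising an assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>

// Hypothetical helper: returns the index of the first difference between
// `expect` and `actual`, or -1 when both have the same length and elements.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    // One range ended early: the mismatch is at the first missing element.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) return index;
    return -1;
}

// Small self-check showing the intended use.
inline void FirstMismatchDemo() {
    const double values[] = { 1.0, 2.5 };
    assert(FirstMismatch({ 1.0, 2.5 }, values) == -1);
}
```

Like AssertRange::AreEqual, the helper only requires that the actual range works with begin and end and that the elements support operator==.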







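Operator is an enumeration whose underlying type is wchar_t, so a character from the expression can be turned into an operator with a plain cast. For now a token is just an operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;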





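To print the compared values in a failure message, Assert needs a ToString function for the compared type, so one has to be provided.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }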







To make the new test pass without breaking the first one, apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }





The next test is for a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





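This test does not compile, because a Token can so far hold only an operator. A token now has to carry either an operator or a number, and there are several ways to design that:

  1. A hierarchy of classes with a common base Token, queried with dynamic_cast.
  2. An approach built on std::function.
  3. A generic holder such as Boost.Any.

Here a simpler design is used: a single Token class that remembers which kind of value it holds. It is developed test by test.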
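For comparison only, on a C++17 compiler the same "operator or number" choice could be expressed with std::variant. This is a swapped-in technique and not what the article does, since the Visual Studio 2013 toolchain it targets has no std::variant. The sketch uses a reduced Operator enumeration, and VariantToken, IsNumber and IsOperator are hypothetical names.

```cpp
#include <cassert>
#include <variant>

// Assumption: a reduced Operator enumeration, mirroring the article's.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// Hypothetical alternative to a hand-written Token class:
// the variant itself remembers which alternative it holds.
using VariantToken = std::variant<Operator, double>;

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}

inline bool IsOperator(const VariantToken &token) {
    return std::holds_alternative<Operator>(token);
}

// Small self-check showing the intended use.
inline void VariantTokenDemo() {
    VariantToken minus = Operator::Minus;
    assert(IsOperator(minus));
    assert(std::get<Operator>(minus) == Operator::Minus);
}
```

Reading the wrong alternative with std::get throws std::bad_variant_access, which plays the same role as a manual type check before returning a value.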



The first test for the new class checks the type of an operator token.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };





As before, a ToString function is needed, this time for TokenType. The next test checks the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }





Make it pass with (constant → scalar): the constant returned by Type becomes a member variable.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };





Next, the operator has to be retrievable from an operator token.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





A stored operator and a conversion operator are enough.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };





The same goes for the value of a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





A token holds either an operator or a number, never both, so the two values can share storage in a union. Each conversion checks the stored type first.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: return L"Unknown token.";
        }
    }
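The type check in the conversion operators can be seen in a small standalone sketch. It uses a condensed copy of the Token class with a one-operator enumeration so that it compiles on its own. RejectsNumberRead is a hypothetical helper added only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Condensed copy of the Token class, kept only to make the sketch self-contained.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: true when reading `token` as a number is rejected.
inline bool RejectsNumberRead(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

// Small self-check showing the intended use.
inline void TokenDemo() {
    assert(RejectsNumberRead(Token(Operator::Plus)));
    assert(!RejectsNumberRead(Token(2.0)));
}
```

Because the inactive union member is never read, a token of the wrong kind produces an exception rather than a meaningless value.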







With the Token class in place, the single-digit lexer test compiles. The simplest code that makes it pass:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };





The next test is for a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





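There is no need to parse numbers by hand. The C standard library has isdigit for classifying characters and atof for converting text to a number, and both have counterparts for wchar_t (iswdigit, and _wtof in the Microsoft runtime). Using them is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }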





Now a test with two tokens.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





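This test fails because Tokenize returns as soon as it has produced the first token. As a preparatory refactoring, the tokens are accumulated in a local variable, result, which is returned at the end of the function.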



std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"
    #include <initializer_list>
    #include <iterator>
    #include <string>

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    // Compares an expected list with an actual range, element by element.
    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen) }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    } // namespace InterpreterTests







The source code for this article is available on GitHub.



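The result of tokenizing is a list of tokens. We keep it in a std::vector and give that type the name Tokens.

To make the test compile, add a stub that does nothing useful yet. It throws, so the test fails (red).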



    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: compiles, but makes the first test fail.
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter





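Now the test has to pass, and with the smallest change that works. A useful guide here is The Transformation Priority Premise (TPP): every step generalizes the code by one of a fixed set of transformations, and a simpler transformation is preferred over a more complex one.

The transformations, from the simplest to the most complex:

  1. ({} → nil) go from no code at all to code that uses nil.
  2. (nil → constant) replace nil with a constant.
  3. (constant → constant+) replace a simple constant with a more complex one.
  4. (constant → scalar) replace a constant with a variable or an argument.
  5. (statement → statements) add more unconditional statements (break, continue, return and so on).
  6. (unconditional → if) split the execution path.
  7. (scalar → array) replace a variable or argument with an array.
  8. (array → container) replace an array with a more complex container.
  9. (statement → recursion) replace a statement with recursion.
  10. (if → while) replace a condition with a loop.
  11. (expression → function) replace an expression with a function or an algorithm.
  12. (variable → assignment) change the value of a variable.

For the first test the simplest transformation is enough.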



inline Tokens Tokenize(std::string expr) { return{}; }
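To make the idea of successive transformations more concrete, here is a minimal sketch on a deliberately unrelated function. The function Sum and the namespaces Step1 to Step3 are hypothetical and serve only as an illustration; each stage is the least general code that passes the tests written so far.

```cpp
#include <vector>

// Hypothetical example, unrelated to the lexer: three stages of one function.
namespace Step1 {
// (nil -> constant): the first test (empty input) needs only a constant.
inline int Sum(const std::vector<int> &) { return 0; }
}

namespace Step2 {
// (unconditional -> if): a one-element test forces a split in the execution path.
inline int Sum(const std::vector<int> &values) {
    if(values.empty()) return 0;
    return values[0];
}
}

namespace Step3 {
// (if -> while): a test with several elements turns the condition into a loop.
inline int Sum(const std::vector<int> &values) {
    int total = 0;
    for(int value : values) total += value;
    return total;
}
}
```

Each stage still passes the tests of the previous one, which is the point: the tests get more specific while the code gets more general.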





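The test passes, and there is nothing to refactor yet.

Before writing the next test, change the parameter type from std::string to std::wstring, so that the lexer can also handle Unicode input.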



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





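AssertRange is a small helper of our own. Its AreEqual function compares the expected list with the actual range element by element and, if they differ, reports the position of the mismatch.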



AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange







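Operator will be an enumeration. Its underlying type is wchar_t, so the value of each operator is simply its character, and a character from the input can be turned into an operator with a cast.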



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





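For the test to compile, Assert must be able to print the values it compares, and for our own types that requires a ToString function, so we add one.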



std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }







The test compiles and fails. To pass it without breaking the first test, we split the execution path (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





Both tests pass. The next test covers a number that consists of a single digit.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





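This test does not even compile: a token now has to hold either an operator or a number. There are several ways to get there:

  1. Make Token the base of a class hierarchy and use dynamic_cast to reach the concrete token type.
  2. Dispatch on the token type through callbacks held in std::function.
  3. Store the value in a type-erased holder such as Boost.Any.

We take a simpler route and keep the type of the token next to its value. That calls for a few tests of Token itself, so the lexer test is set aside for a moment.

The first of them checks the type of an operator token.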



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };
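For comparison with the type tag chosen above, the class-hierarchy alternative could look roughly like the sketch below. All names in it (TokenBase, OperatorToken, NumberToken, SumNumbers) are hypothetical; this is not the design used in the article.

```cpp
#include <memory>
#include <vector>

// Hypothetical alternative design: one class per token kind.
struct TokenBase {
    virtual ~TokenBase() {}
};

struct OperatorToken : TokenBase {
    explicit OperatorToken(wchar_t c) : code(c) {}
    wchar_t code;
};

struct NumberToken : TokenBase {
    explicit NumberToken(double v) : value(v) {}
    double value;
};

typedef std::vector<std::unique_ptr<TokenBase>> TokenList;

// Adds up all number tokens, asking each token for its concrete type at run time.
inline double SumNumbers(const TokenList &tokens) {
    double sum = 0.0;
    for(const auto &token : tokens) {
        if(auto *number = dynamic_cast<const NumberToken *>(token.get())) {
            sum += number->value;
        }
    }
    return sum;
}
```

The price of this design is a heap allocation per token and a dynamic_cast wherever the concrete type matters, which is one reason a small tagged value type is attractive for a lexer this simple.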





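A ToString function for TokenType is needed as well; it is written in the same way and is not shown here. The next test checks the type of a number token.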



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





The test fails. We apply (constant → scalar) and replace the constant with a member variable.



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





Next, an operator token should give back its operator.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





We store the operator in the token and add a conversion operator.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





The same test for a number token:



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





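A token holds either an operator or a number, never both, so the two values can share storage in a union. Each conversion checks the type first and throws if the token is of the wrong kind.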



Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }
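Because the conversions throw for a token of the wrong kind, calling code should check Type() first. A minimal sketch of that usage follows; the Token class is a compact copy of the one above with only the Plus operator, and the helper TryGetNumber is hypothetical, not part of the article's code.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Compact copy of the Token class from the listing above.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: stores the value and returns true for a number token,
// returns false for any other token. It never throws.
inline bool TryGetNumber(const Token &token, double &out) {
    if(token.Type() != TokenType::Number) return false;
    out = token; // safe: the type tag was checked above
    return true;
}
```

Converting an operator token to double without such a check ends in std::logic_error, which turns a silent read of the wrong union member into a visible failure.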







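Token can now hold both kinds of values, so we return to the lexer, where the single-digit test is still failing.

The quickest way to pass it: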



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





The test passes. The next one covers a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





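The simplest way to pass it is to rely on the C library: isdigit to detect a digit and atof to parse the number, or rather their wchar_t counterparts, iswdigit and _wtof. This is the (expression → function) transformation.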



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





The next test has an operator followed by a number.



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





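The function now has to return more than one token. Before changing its behavior we refactor: the tokens are collected in a local variable, result, which is returned at the end.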



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





After that the required transformation is obvious: (if → while).



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





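wcstod replaces _wtof here because, in addition to the parsed value, it reports where the number ended, and the loop has to continue from that position.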
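The behavior of wcstod that the loop relies on can be shown in isolation. The wrapper ScanNumberPrefix below is a hypothetical name used only for this sketch; the call to std::wcstod itself is standard.

```cpp
#include <cwchar>

// Parses the number at the start of text and stores it in value.
// Returns a pointer to the first character that was not consumed;
// if no number could be parsed, the returned pointer equals text.
inline const wchar_t *ScanNumberPrefix(const wchar_t *text, double &value) {
    wchar_t *end = nullptr;
    value = std::wcstod(text, &end);
    return end;
}
```

For the input L"12.34+5" the value is 12.34 and the returned pointer refers to the plus sign, so the lexer can go on from there. Two caveats: wcstod also accepts a leading sign, leading whitespace and an exponent, and the decimal separator it expects depends on the current C locale.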



The test passes. Next, spaces in the expression have to be skipped.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





One more (unconditional → if): a character that is neither a digit nor an operator is skipped.



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





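The function has grown, so it is time to refactor. The logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.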



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext(); // skip anything else, for example spaces
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







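So far the lexer recognizes only the plus operator. Instead of a separate test for every remaining operator and for the parentheses, one test with a complex expression covers them all.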



TEST_METHOD(Should_tokenize_complex_expression) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





First, the Operator enumeration is extended with the new operators, so that the test compiles.



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





The test fails, because IsOperator in Tokenizer knows only the plus sign. We make it check all operators.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
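The same check can also be written without enumerating the Operator values. The free function below is a hypothetical variant, shown only as a sketch; it assumes the operator characters match the values of the Operator enumeration.

```cpp
#include <cwchar>

// Hypothetical free-function variant of IsOperator.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also finds the terminating L'\0', so rule that out first.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}
```

The drawback is that the list of characters now lives in two places, the enumeration and this string, while the any_of version above stays next to the Operator names.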





All tests pass. The complete code of the lexer:



Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext(); // skip anything else, for example spaces
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter













. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







With Token in place, we can return to the lexer. The simplest code that makes the single-digit test pass:



if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };





The test passes. Next on the list is a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





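There is no need to parse the number by hand, because the C library already does it: isdigit detects a digit and atof converts the text, or rather their wchar_t counterparts iswdigit and _wtof, since the expression is a wide string. This is the (expression → function) transformation.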



inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}





The test passes. The next one combines an operator and a number.



TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





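Before changing the behavior, a small refactoring helps: instead of returning from every branch, the tokens are collected in a result variable, which is returned at the end.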



inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}





Now the transformation itself: (if → while). The loop keeps reading tokens until it reaches the end of the expression.



inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





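wcstod replaces _wtof here because, through its second argument, it also reports where the number ends, so the lexer can continue scanning from that position.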
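The role of the end pointer is easier to see in isolation. The sketch below uses only the standard library; ScanNumbers is a hypothetical helper, not part of the interpreter, that collects every number in a string and skips everything else.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects every number in expr, skipping other characters.
inline std::vector<double> ScanNumbers(const std::wstring &expr) {
    std::vector<double> numbers;
    const wchar_t *current = expr.c_str();
    while (*current) {
        if (std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod stores in 'end' a pointer to the first character it did not parse.
            numbers.push_back(std::wcstod(current, &end));
            current = end; // resume right after the number
        } else {
            ++current;     // not a digit: move on by one character
        }
    }
    return numbers;
}
```

Because the scan only starts at a digit, wcstod always consumes at least one character, so the loop cannot stall.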



Next on the list: the lexer should skip spaces.



TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





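The (unconditional → if) transformation is enough here: a character becomes an operator token only if it is a known operator, and anything else is skipped.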



while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}





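The function has grown, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.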



inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}





Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







The last lexer test is an expression that uses every operator and the parentheses.



TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





First, the Operator enumeration gets the remaining operators and the parentheses.



enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





The test still fails. The IsOperator method of the Tokenizer class has to recognize all of them.



bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
        Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {
        return *m_current == static_cast<wchar_t>(o);
    });
}
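The same check can be tried without the Tokenizer class. This is a sketch under that assumption: IsOperatorChar is a hypothetical free function that takes the character as an argument instead of reading m_current.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')'
};

// Hypothetical free-function variant of Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t ch) {
    const auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
        Operator::LParen, Operator::RParen
    };
    // True as soon as one enumerator has the same character code as ch.
    return std::any_of(all.begin(), all.end(), [ch](Operator o) {
        return ch == static_cast<wchar_t>(o);
    });
}
```

The list of enumerators has to be kept in sync with the enumeration by hand, which is the price of this short form.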





All tests pass. The complete code at this point:



Interpreter.h

#pragma once
#include <vector>
#include <string>    // std::wstring, std::to_wstring
#include <stdexcept> // std::logic_error, std::out_of_range
#include <wchar.h>   // wcstod
#include <wctype.h>  // iswdigit
#include <algorithm>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
            Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer

} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}







GitHub . , . "__".



. , .




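The token list is a std::vector, aliased as Tokens. The first version of Tokenize only has to compile and make the test fail.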



#pragma once
#include <vector>
#include <string>    // std::string
#include <exception> // std::exception

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer

} // namespace Interpreter





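How much code should be written to make a test pass? One useful guide is The Transformation Priority Premise (TPP). Refactorings change the structure of the code without changing its behavior. Transformations, by contrast, change the behavior, moving the code from the specific to the general. The transformations are ordered by priority, and at each step the simplest one that makes the failing test pass is preferred.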



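The transformations, from the highest priority to the lowest:

  1. ({} → nil): from no code at all to code that returns nothing.
  2. (nil → constant): from nothing to a constant.
  3. (constant → constant+): from a simple constant to a more complex one.
  4. (constant → scalar): a constant is replaced with a variable or an argument.
  5. (statement → statements): more unconditional statements are added (break, continue, return and the like).
  6. (unconditional → if): the execution path is split.
  7. (scalar → array): a variable is replaced with an array.
  8. (array → container): an array is replaced with a container.
  9. (statement → recursion): a statement is replaced with recursion.
  10. (if → while): a condition is replaced with a loop.
  11. (expression → function): an expression is replaced with a function or an algorithm.
  12. (variable → assignment): the value of a variable is replaced.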
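As a rough illustration of how the first transformations follow one another, the sketch below shows three successive versions of one function side by side. The TokenizeStage names and the use of plain wchar_t values as tokens are assumptions made for brevity and are not the code of this article.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stage 1, ({} -> nil): the smallest code that compiles returns "nothing".
inline std::vector<wchar_t> TokenizeStage1(const std::wstring &) {
    return {};
}

// Stage 2, (nil -> constant): a hard-coded answer satisfies one specific test.
inline std::vector<wchar_t> TokenizeStage2(const std::wstring &) {
    return { L'+' };
}

// Stage 3, (unconditional -> if) together with (constant -> scalar):
// the path splits on the input, and the constant gives way to the input value.
inline std::vector<wchar_t> TokenizeStage3(const std::wstring &expr) {
    if (expr.empty()) return {};
    return { expr[0] };
}
```

Each stage is only as general as the tests written so far demand; the next failing test decides which transformation comes next.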

The first transformation is enough to make the test pass.



inline Tokens Tokenize(std::string expr) {
    return{};
}





, . .



. . . . . .

Before the next test, std::string is replaced with std::wstring, so that expressions can contain Unicode characters.



TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}





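AssertRange is a small helper namespace: its AreEqual function compares an expected list of values with an actual range, first by size and then element by element, and reports the position of the first mismatch.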



AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
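The comparison logic does not depend on the test framework. A minimal portable sketch, assuming a hypothetical FirstMismatch function that returns a position instead of raising an assertion, could look like this:

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>

// Hypothetical helper: returns -1 if both ranges are equal, otherwise the
// position of the first difference. If one range is a prefix of the other,
// the length of the shorter range is returned.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter)) return position;
    }
    const bool sameSize =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameSize ? -1 : position;
}
```

Taking the expected values as an initializer_list is what allows a call such as FirstMismatch({ 1, 2, 3 }, values) with a braced list written in place.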







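Operator is an enumeration whose underlying type is wchar_t, and each value equals the character of the operator, so converting between a character and an operator is a simple cast.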



enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;
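A short sketch of what the wchar_t underlying type gives us. ToOperator and OperatorText are hypothetical names used only in this example; the article itself performs the same casts inline.

```cpp
#include <cassert>
#include <string>

enum class Operator : wchar_t {
    Plus = L'+',
};

// Hypothetical helper: a character becomes an operator by a plain cast.
// Note that the cast does not validate the character.
inline Operator ToOperator(wchar_t ch) {
    return static_cast<Operator>(ch);
}

// Hypothetical helper: the reverse cast yields the printable character.
inline std::wstring OperatorText(Operator op) {
    return std::wstring(1, static_cast<wchar_t>(op));
}
```

No lookup table is needed in either direction, but checking that a character really is a known operator remains the caller's job.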





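For the code to compile, Assert needs a ToString function for the compared type, which it uses to print the values when an assertion fails.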



std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}







Now the test compiles but fails, because Tokenize always returns an empty list. To make it pass, apply (unconditional → if).



inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





The next test on the list: a single digit.



TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





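This test does not compile, because a token now has to be either an operator or a number. There are several ways to model that: a Token base class with derived token types told apart with dynamic_cast, a design built around std::function, or a type-erased holder such as Boost.Any. The approach taken here is a single Token class that stores its own type, and it is developed test by test.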



InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




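The token list is a plain std::vector, aliased as Tokens. To make the first test compile and fail (red), a stub header is enough:

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: throws, so the first test fails.
inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter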





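Now the test has to pass, and with as little effort as possible. To choose the next step we can lean on The Transformation Priority Premise (TPP), proposed by Robert Martin. Transformations are the counterpart of refactorings: a refactoring changes the structure of the code without changing its behavior, whereas a transformation changes the behavior, moving the code from the specific to the more general. The transformations are ordered by priority, simplest first. When several of them would make the test pass, TPP suggests taking the one highest on the list.

The transformations, in priority order:

  1. ({} → nil): from no code at all to code that uses nil.
  2. (nil → constant): replace nil with a constant.
  3. (constant → constant+): from a simple constant to a more complex one.
  4. (constant → scalar): replace a constant with a variable or an argument.
  5. (statement → statements): add more unconditional statements (break, continue, return and the like).
  6. (unconditional → if): split the execution path.
  7. (scalar → array): from a single value to an array.
  8. (array → container): from an array to a more general container.
  9. (statement → recursion): from a statement to a recursive call.
  10. (if → while): turn a condition into a loop.
  11. (expression → function): replace an expression with a function or algorithm.
  12. (variable → assignment): replace the value of a variable.

For the first test it is enough to return an empty list.

inline Tokens Tokenize(std::string expr) {
    return {};
}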





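The first test is green. The next one on the list checks a single plus operator. Note that std::string has been replaced with std::wstring here, so that the lexer can handle Unicode input.

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}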





AssertRange is a small helper namespace of our own. Its AreEqual compares an expected initializer list with the actual range, first by size and then element by element, and reports the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare the sizes first, then the elements pairwise.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
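For readers who want to try the same comparison outside Visual Studio, here is a minimal framework-free sketch. The name FirstMismatch and its return convention are assumptions made for this illustration; they are not part of the article's code or of CppUnitTestFramework.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper, not part of the article's code.
// Returns -1 when both ranges are equal, otherwise the index of the first
// mismatch (or the length of the shorter range when only the sizes differ).
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    const bool sameLength = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameLength ? -1 : index;
}
```

Returning a position instead of asserting inside the helper keeps it independent of any test framework; a test can then assert that the result is -1.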







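Operator is an enum class with wchar_t as its underlying type, so the value of each enumerator is the operator character itself. For the moment a token is nothing more than an operator, so Token is just an alias.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;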





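For the test to compile, Assert also needs a ToString function for the compared type; it is used to print the values when an assertion fails.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return { static_cast<wchar_t>(token) };
}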
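The trick of giving the enumeration a wchar_t underlying type can be tried in isolation. The sketch below is only an illustration: the Minus enumerator and the ToOperator helper are assumptions added here and do not exist in the article at this step.

```cpp
#include <cassert>
#include <string>

// Sketch only: an illustrative subset of operators.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
};

// The enumerator value is the character itself, so a cast is all it takes.
inline std::wstring ToString(Operator op) {
    return std::wstring(1, static_cast<wchar_t>(op));
}

// Going the other way is an unchecked cast; the caller must make sure
// the character really is one of the operators.
inline Operator ToOperator(wchar_t c) {
    return static_cast<Operator>(c);
}
```

The price of this convenience is that the cast from a character is never validated, so the check has to happen before the conversion.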







Now the test compiles and fails, as it should. To make it pass we split the execution path (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return {};
    }
    return { static_cast<Operator>(expr[0]) };
}





The next test tokenizes a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





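The test does not compile: a token is no longer just an operator, it can also be a number. The alternatives considered include a class hierarchy rooted at Token and queried with dynamic_cast, a solution based on std::function, and Boost.Any. Here we take a simpler route: one Token class that knows its own type. Its tests go into a separate TokenTests class, starting with the type of an operator token.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};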





The assertion needs a ToString overload for TokenType as well. The next test covers a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}





To pass it, the constant becomes a member variable (constant → scalar).

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }

private:
    TokenType m_type;
};





Next, an operator token should give back its operator.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





The operator is stored in the token and exposed through a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same goes for a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two fields can share storage in a union. Converting a token to the wrong type throws.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
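With a C++17 compiler, the same "either an operator or a number" idea could also be expressed with std::variant instead of a hand-written tagged union. This is an alternative that the article does not use; VariantToken and the two helper functions are names assumed for this sketch.

```cpp
#include <cassert>
#include <variant>

// Illustrative subset of the article's Operator enumeration.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// Hypothetical alternative to the hand-written tagged union (C++17):
// the variant itself remembers which alternative it currently holds.
using VariantToken = std::variant<Operator, double>;

inline bool IsOperator(const VariantToken &token) {
    return std::holds_alternative<Operator>(token);
}

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}
```

std::get<double>(token) then plays the role of the conversion operator: it throws std::bad_variant_access when the token holds the other alternative, much like the std::logic_error thrown by the Token class.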







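Back to the lexer. The single-digit test now compiles but fails. The simplest change that makes it pass:

if(expr[0] >= L'0' && expr[0] <= L'9') {
    return { static_cast<double>(expr[0] - L'0') };
}
return { static_cast<Operator>(expr[0]) };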
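The digit arithmetic in this step relies on the decimal digit characters being contiguous, so subtracting L'0' yields the numeric value. A tiny stand-alone sketch, where DigitValue and its -1.0 convention are assumptions for illustration:

```cpp
#include <cassert>

// Hypothetical helper for illustration only.
// Returns the numeric value of a decimal digit character, or -1.0 otherwise.
inline double DigitValue(wchar_t c) {
    if(c >= L'0' && c <= L'9') {
        return static_cast<double>(c - L'0');
    }
    return -1.0;
}
```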





This handles a single digit only. The next test asks for a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





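The C standard library already has what we need: isdigit to recognize a digit and atof to convert a string to a number, along with wide-character counterparts for wchar_t. Replacing the hand-written expression with library calls is the (expression → function) transformation. Note that _wtof is specific to the Microsoft C runtime.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return {};
    if(iswdigit(*current)) return { _wtof(current) };
    return { static_cast<Operator>(*current) };
}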





The next test: an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





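One token is no longer enough, so the function needs somewhere to collect them. As a preparatory step we introduce a local variable result and fill it; the behavior stays the same.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}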





Now the transformation itself: (if → while). The check for the end of the string becomes the loop condition.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





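wcstod is used here in place of _wtof because it also returns, through its second argument, a pointer to the first character after the number. That is exactly what the loop needs in order to continue scanning.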
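The end-pointer behavior of wcstod can be checked in isolation. In the sketch below, the ScannedNumber struct and the ScanNumber function are assumptions for illustration; only std::wcstod itself comes from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical result type: the parsed value and how many characters it used.
struct ScannedNumber {
    double value;
    std::size_t length;
};

// Parses the number at the start of `text`. When no number is found,
// wcstod leaves `end` equal to `text`, so the length is zero.
inline ScannedNumber ScanNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    return { value, static_cast<std::size_t>(end - text) };
}
```

A zero length is worth checking in a real lexer: advancing by it would leave the scanner stuck on the same character.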



The next test: spaces in the expression should be skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





Another (unconditional → if): only a plus is treated as an operator, and any other character is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}





All tests are green, so it is time to refactor. The scanning code moves into a Tokenizer class inside the Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext(); // Skip anything that is neither a number nor an operator.
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







The last lexer test is a complete expression with all operators and parentheses.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





The Operator enumeration gets the remaining operators and the parentheses.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





IsOperator in Tokenizer now has to recognize all of them.

bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
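The membership check could also be written as a lookup in a string of operator characters. The free function IsOperatorChar below is an assumption for illustration, not the article's code; it lists the same six characters as the Operator enumeration.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant: the operator characters are kept in one string.
// std::wstring::find does not match the terminating null character,
// so the end of the expression is never mistaken for an operator.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}
```

The drawback of this form is that the string and the enumeration have to be kept in sync by hand, whereas the any_of version at least names the enumerators explicitly.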





All tests pass. The complete code:

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
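For readers without Visual Studio, the scanning loop can be tried with nothing but the standard library. The sketch below is a simplified variant, not the article's Tokenizer: the function name SplitIntoLexemes is an assumption, and it returns the text of each token rather than Token objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical, simplified variant of the scanning loop: numbers and
// operators are returned as strings, everything else is skipped.
inline std::vector<std::wstring> SplitIntoLexemes(const std::wstring &expr) {
    static const std::wstring operators = L"+-*/()";
    std::vector<std::wstring> result;
    const wchar_t *current = expr.c_str();
    while(*current != L'\0') {
        if(std::iswdigit(static_cast<std::wint_t>(*current))) {
            // wcstod is used only to find where the number ends.
            wchar_t *end = nullptr;
            (void)std::wcstod(current, &end);
            result.push_back(std::wstring(current, static_cast<std::size_t>(end - current)));
            current = end;
        }
        else if(operators.find(*current) != std::wstring::npos) {
            result.push_back(std::wstring(1, *current));
            ++current;
        }
        else {
            ++current; // Skip spaces and anything unrecognized.
        }
    }
    return result;
}
```

Because the loop enters the number branch only on a digit, wcstod always consumes at least one character there, so the scan is guaranteed to make progress.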







The code is available on GitHub.




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





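AssertRange is a small helper written for these tests. Its AreEqual compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare sizes first so that a length difference is reported clearly.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    // Then compare element by element and report the failing position.
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange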
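The comparison logic of AssertRange can be tried without the Visual Studio test framework. The sketch below is a portable approximation under that assumption: FirstDifference is a hypothetical name, and instead of failing an assertion it returns the position that the helper above would report.

```cpp
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first position where the
// two ranges differ, or -1 when they have equal length and equal elements.
// A length difference is reported at the index where the shorter range ends.
template<class T, class ActualRange>
long FirstDifference(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter)) return position;
    }
    bool sameLength = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameLength ? -1 : position;
}
```

As in the original helper, the element type only has to support operator==, so the same function works for tokens once Token defines equality.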







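Operator is an enum class with wchar_t as its underlying type, so the value of each enumerator is the character that denotes the operator. For now, Token is simply an alias for Operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;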





To show values in failure messages, Assert needs a ToString function for the compared type, so we provide one for Token.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}







Now the test compiles and fails. To make it pass, apply (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





Next come numbers, starting with a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





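This test does not compile: a token is now either an operator or a number, and Token has to represent both. There are several ways to do that:

One is a hierarchy of classes derived from Token, at the cost of a dynamic_cast wherever the concrete kind of token matters. Another could be built on std::function. A third is a type-erased holder such as Boost.Any.

None of these is needed yet. A single Token class that knows its own type is enough for now.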
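For comparison, here is a minimal sketch of the hierarchy alternative, assuming that the dynamic_cast mentioned above refers to a polymorphic token hierarchy. TokenBase, OperatorToken, NumberToken and SumNumbers are hypothetical names that do not appear in the article's code.

```cpp
#include <memory>
#include <vector>

// Hypothetical hierarchy, shown only to contrast with the single Token class.
struct TokenBase {
    virtual ~TokenBase() {}
};

struct OperatorToken : TokenBase {
    explicit OperatorToken(wchar_t symbolValue) : symbol(symbolValue) {}
    wchar_t symbol;
};

struct NumberToken : TokenBase {
    explicit NumberToken(double numberValue) : value(numberValue) {}
    double value;
};

// Sums all number tokens; every access to a concrete kind needs a dynamic_cast.
inline double SumNumbers(const std::vector<std::unique_ptr<TokenBase>> &tokens) {
    double sum = 0.0;
    for (const auto &token : tokens) {
        if (auto number = dynamic_cast<const NumberToken *>(token.get())) {
            sum += number->value;
        }
    }
    return sum;
}
```

The sketch needs a heap allocation per token and a cast at every use, which is a lot of machinery for two kinds of token.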



Token itself now needs tests. Start with the token type.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};





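A ToString for TokenType is needed as well; it appears in the full listing of Interpreter.h. The next test covers number tokens.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}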





To pass it, apply (constant → scalar): the constant returned by Type() becomes a member variable.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};





Next, a token must give back the operator it was created from.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





A conversion operator is enough.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same for the value of a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the type and throw on a mismatch.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only the member selected by m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
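The effect of the type checks is easiest to see in use. The sketch below repeats a reduced copy of the Token class so that it compiles on its own; RejectsNumberAccess is a hypothetical helper added only for the illustration.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Reduced copy of the article's Token, repeated to keep the sketch standalone.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a number is rejected.
inline bool RejectsNumberAccess(const Token &token) {
    try {
        double ignored = static_cast<double>(token);
        (void)ignored;
        return false;
    } catch (const std::logic_error &) {
        return true;
    }
}
```

Reading an operator token as a number throws instead of silently reinterpreting the union's storage.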







With Token done, return to the lexer and the single-digit test. The simplest code that passes it:

if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };





It passes, but handles only a single digit. The next test needs a real number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





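There is no need to parse numbers by hand: the C standard library has isdigit to recognize a digit and atof to convert text to a number. Since the lexer works with wchar_t, we use their wide-character counterparts iswdigit and _wtof (the latter is specific to the Microsoft C runtime). Replacing the hand-written expression with a library call is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}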





Now a sequence of two tokens.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





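The test fails, since only the first token is returned. Before changing the behavior, do a small refactoring: collect the tokens in a local variable result, so that there is one place to add to.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}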





Now the transformation is obvious: (if → while). The check for the end of the string becomes the loop condition.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // resume right after the number
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





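wcstod replaces _wtof because, besides the value, it reports through its second argument where the number ended. The loop needs that position to resume scanning after the number.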
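The end-pointer behavior of wcstod can be checked in isolation. ScanLeadingNumber below is a hypothetical helper, not part of the article's lexer; it assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cwchar>
#include <string>
#include <utility>

// Hypothetical helper: parses the number at the start of text and returns
// the value together with the count of characters that wcstod consumed.
inline std::pair<double, std::size_t> ScanLeadingNumber(const std::wstring &text) {
    const wchar_t *start = text.c_str();
    wchar_t *end = nullptr;
    double value = std::wcstod(start, &end);
    return std::make_pair(value, static_cast<std::size_t>(end - start));
}
```

For the input "12.34+5" the helper consumes five characters and stops at the plus sign. When nothing can be parsed, no characters are consumed. wcstod itself also skips leading whitespace and accepts a sign; the lexer above only calls it when the current character is a digit.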



Next, spaces must be skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





Apply (unconditional → if) once more: only a known operator produces a token, and any other character is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        // Not a digit and not an operator: skip it.
        ++current;
    }
}





Tokenize has grown enough to deserve a refactoring. The scanning logic moves into a Tokenizer class in a Detail namespace, and Tokenize itself becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    // Keeps a pointer into expr, so expr must outlive the tokenizer.
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







One last test for the lexer: a complete expression with all the operators and parentheses.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





Operator gains the enumerators that were missing.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





The test still fails, because IsOperator in Tokenizer recognizes only the plus sign. It has to check all the operators.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    // std::any_of requires <algorithm>.
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
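A different way to write the same check is to keep the operator characters in one string and search it. This is only a sketch of an alternative, not the article's approach; IsOperatorChar is a hypothetical free function.

```cpp
#include <string>

// Hypothetical variant: all operator characters kept in one wide string.
// The terminating null is not part of the string, so L'\0' is never found.
inline bool IsOperatorChar(wchar_t ch) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(ch) != std::wstring::npos;
}
```

The string version is shorter, but it repeats the characters already listed in the Operator enum, so the two can drift apart. The any_of version above keeps the enum as the single place where operators are defined.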





All the lexer tests pass. The complete code so far:

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}







The code for this article is available on GitHub.









. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



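The test does not compile yet, so we add just enough code to build it: an empty Token structure, Tokens as a std::vector of tokens, and a Tokenize function that only throws. The test now compiles and fails (red).

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter
{
struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer
{
inline Tokens Tokenize(std::string expr)
{
    throw std::exception();
}
} // namespace Lexer
} // namespace Interpreter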





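Now the test has to pass, and with the simplest code that can do it. What "simplest" means is the subject of The Transformation Priority Premise (TPP), formulated by Robert C. Martin. Refactorings change the structure of code without changing its behavior; transformations are their counterparts that change behavior, each moving the code from a specific case toward a more general one. Transformations are ordered by complexity, and the premise is that preferring the simpler ones, and choosing the next test so that a simple transformation is enough to pass it, leads to simpler code and fewer dead ends.

Transformations in order of priority:

  1. ({} → nil) From no code at all to code that uses nil.
  2. (nil → constant) Replacing nil with a constant.
  3. (constant → constant+) Replacing a simple constant with a more complex one.
  4. (constant → scalar) Replacing a constant with a variable or an argument.
  5. (statement → statements) Adding more unconditional statements (break, continue, return and the like).
  6. (unconditional → if) Splitting the execution path.
  7. (scalar → array) From a single variable to an array.
  8. (array → container) From an array to a more complex container.
  9. (statement → recursion) Replacing a statement with recursion.
  10. (if → while) Replacing a condition with a loop.
  11. (expression → function) Replacing an expression with a function or an algorithm.
  12. (variable → assignment) Replacing the value of a variable.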
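As an illustration of how transformations generalize code step by step, here is a minimal sketch. The CountDigits functions are hypothetical and not part of the interpreter; each version generalizes the previous one, mainly through the transformation named in its comment.

```cpp
#include <cwctype>
#include <string>

// Step 1, (nil → constant): enough for the test "an empty string has no digits".
inline int CountDigitsV1(const std::wstring &)
{
    return 0;
}

// Step 2, (unconditional → if): enough for the test "a single digit counts as one".
inline int CountDigitsV2(const std::wstring &text)
{
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}

// Step 3, (if → while): the general case, driven by a test with several digits.
inline int CountDigitsV3(const std::wstring &text)
{
    int count = 0;
    const wchar_t *current = text.c_str();
    while(*current)
    {
        if(std::iswdigit(*current)) ++count;
        ++current;
    }
    return count;
}
```

Each version is only as general as the tests written so far require, which is the same discipline the lexer follows in the steps of this article.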

For the first test, the simplest transformation is enough: the function returns an empty list.

inline Tokens Tokenize(std::string expr)
{
    return{};
}





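The test passes (green), and there is nothing to refactor yet. Before writing the next test, we change the parameter type from std::string to std::wstring so that the lexer works with Unicode strings. The next test checks a single operator.

TEST_METHOD(Should_tokenize_single_plus_operator)
{
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}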





AssertRange is a small helper namespace of our own. Its AreEqual function compares two sequences: first their sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange:

namespace AssertRange
{
template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual)
{
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter)
    {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}
} // namespace AssertRange







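Operator is an enumeration with wchar_t as its underlying type, so a character of the expression can be converted to an operator with a plain cast. For now, Token is just an alias for Operator.

enum class Operator : wchar_t
{
    Plus = L'+',
};

typedef Operator Token;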





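For the test to compile, Assert must be able to print the compared values in its failure message, so it needs a ToString function for our type.

std::wstring ToString(const Token &):

inline std::wstring ToString(const Token &token)
{
    return{ static_cast<wchar_t>(token) };
}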







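The test compiles and fails, because Tokenize always returns an empty list. To make it pass, we apply the (unconditional → if) transformation.

inline Tokens Tokenize(std::wstring expr)
{
    if(expr.empty())
    {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}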





Both tests pass. The next test tokenizes a number consisting of a single digit.

TEST_METHOD(Should_tokenize_single_digit)
{
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





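This test does not compile: a token must now hold either an operator or a number. Several designs come to mind:

  1. A Token base class with a subclass for each kind of token, where the concrete type is recovered with dynamic_cast.
  2. A design built around std::function.
  3. A type-erased holder such as Boost.Any.

We take the simplest route instead: a single Token class that records its type. It gets its own tests, starting with the token type.

enum class TokenType
{
    Operator,
    Number
};

class Token
{
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests)
{
public:
    TEST_METHOD(Should_get_type_for_operator_token)
    {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};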





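ToString has to be added for TokenType as well, since Assert now compares values of that type. The next test covers a number token.

TEST_METHOD(Should_get_type_for_number_token)
{
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}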





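The test fails. We apply the (constant → scalar) transformation: the constant returned by Type() becomes a member variable set in the constructor.

class Token
{
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};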





Next, a token must give back the operator it was built from.

TEST_METHOD(Should_get_operator_code_from_operator_token)
{
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





The operator is stored in a member and returned through a conversion operator.

class Token
{
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same is needed for a number token.

TEST_METHOD(Should_get_number_value_from_number_token)
{
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the token type and throw if the wrong value is requested.

Token:

class Token
{
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const
    {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const
    {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union
    {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token)
{
    switch(token.Type())
    {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token.";
    }
}
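To see how the type check in the conversion operators behaves, here is a minimal standalone sketch. It repeats a trimmed-down copy of Token (with only Plus in Operator) so that it compiles on its own; the ThrowsOnWrongConversion helper is hypothetical and exists only for illustration.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token
{
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const
    {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const
    {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union
    {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: returns true if asking an operator token for a number throws.
inline bool ThrowsOnWrongConversion()
{
    Token plus(Operator::Plus);
    try
    {
        double value = static_cast<double>(plus);
        (void)value; // never reached; silences the unused-variable warning
    }
    catch(const std::logic_error &)
    {
        return true;
    }
    return false;
}
```

The check turns a silent read of the inactive union member into an exception, which is what makes the union safe to use from the rest of the interpreter.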







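Now that Token can represent both operators and numbers, we return to the lexer and the single-digit test. The simplest change that makes it pass:

if(expr[0] >= L'0' && expr[0] <= L'9')
{
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };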





The test passes, but only for one digit. The next test is a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number)
{
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





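To recognize and convert a number, we can rely on the C standard library: isdigit to detect a digit and atof to convert the text, or rather their wide-character counterparts, since the expression consists of wchar_t characters. This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr)
{
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}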





The test passes. The next test has two tokens: an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number)
{
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





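This test fails, since Tokenize returns after the first token. As a first step toward handling several tokens, we introduce a result variable and collect the tokens in it.

inline Tokens Tokenize(std::wstring expr)
{
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current))
    {
        result.push_back(_wtof(current));
    }
    else
    {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}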





The next transformation is (if → while): the check for the end of the string becomes a loop that walks through the whole expression.

inline Tokens Tokenize(std::wstring expr)
{
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current)
    {
        if(iswdigit(*current))
        {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else
        {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





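Note that wcstod replaces _wtof here: besides converting the number, it reports through its second argument where the number ends, so the lexer can continue from that position.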
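The following standalone sketch shows the behavior of wcstod that the lexer relies on. The ScanLeadingNumber helper and the ScannedNumber structure are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <cwchar>

struct ScannedNumber
{
    double value;       // the converted number
    std::size_t length; // how many characters wcstod consumed
};

// Converts the number at the start of text and reports where it ends.
inline ScannedNumber ScanLeadingNumber(const wchar_t *text)
{
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    return { value, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.34+1", wcstod stops at the plus sign, so length is 5 and scanning can continue from there. If no number can be parsed, the end pointer equals the start pointer and length is 0, which is why the lexer calls wcstod only after it has seen a digit.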



The test passes. The next test checks that spaces are skipped.

TEST_METHOD(Should_skip_spaces)
{
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





The (unconditional → if) transformation does the job: a character becomes an operator token only if it is a known operator, and any other character is skipped.

while(*current)
{
    if(iswdigit(*current))
    {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus))
    {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else
    {
        ++current;
    }
}





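All tests pass, so it is time to refactor. The tokenizing logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr)
{
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}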





Detail::Tokenizer:

namespace Detail
{
class Tokenizer
{
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize()
    {
        while(!EndOfExpression())
        {
            if(IsNumber())
            {
                ScanNumber();
            }
            else if(IsOperator())
            {
                ScanOperator();
            }
            else
            {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber()
    {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator()
    {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};
} // namespace Detail







The code is now easier to read and to extend. The last lexer test is a complete expression with all operators and parentheses.

TEST_METHOD(Should_tokenize_complex_expression)
{
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





The Operator enumeration gets the remaining operators and the parentheses.

enum class Operator : wchar_t
{
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





The IsOperator method of the Tokenizer class has to recognize them as well.

bool IsOperator() const
{
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
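The same check can be written more compactly. As a sketch under the assumption that every operator is a single character, the hypothetical IsOperatorChar function below looks the character up in a fixed string with wcschr. It is an alternative for comparison, not the code used in the article.

```cpp
#include <cwchar>

// Hypothetical alternative to IsOperator: looks the character up in a fixed set.
inline bool IsOperatorChar(wchar_t ch)
{
    // wcschr also matches the terminating null character, so exclude it explicitly.
    return ch != L'\0' && std::wcschr(L"+-*/()", ch) != nullptr;
}
```

The drawback is that the set of characters is repeated outside the Operator enumeration, whereas the any_of version stays tied to the enumerators themselves.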





All tests pass. The complete code at this point:

Interpreter.h:

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter
{

enum class Operator : wchar_t
{
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op)
{
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType
{
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type)
{
    switch(type)
    {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token
{
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const
    {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const
    {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right)
    {
        if(left.m_type == right.m_type)
        {
            switch(left.m_type)
            {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union
    {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token)
{
    switch(token.Type())
    {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer
{
namespace Detail
{
class Tokenizer
{
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize()
    {
        while(!EndOfExpression())
        {
            if(IsNumber())
            {
                ScanNumber();
            }
            else if(IsOperator())
            {
                ScanOperator();
            }
            else
            {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber()
    {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const
    {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator()
    {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};
} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr)
{
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}
} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp:

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests
{
using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange
{
template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual)
{
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter)
    {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}
} // namespace AssertRange

TEST_CLASS(LexerTests)
{
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression)
    {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator)
    {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit)
    {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number)
    {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number)
    {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces)
    {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression)
    {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests)
{
public:
    TEST_METHOD(Should_get_type_for_operator_token)
    {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token)
    {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token)
    {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token)
    {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};
}







The code for this article is available on GitHub.










InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



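The first test needs a Tokenize function in the Lexer namespace. It returns a std::vector of tokens, aliased as Tokens. To see the test fail first (red), the function only throws for now.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter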





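The next step is to make the test pass with the simplest possible change. The Transformation Priority Premise (TPP) helps to choose it: code changes are ranked from simple to complex, and the simplest transformation that makes the test pass is preferred.

The transformations, in order of priority:

  1. ({} → nil): from no code at all to code that employs nil.
  2. (nil → constant): from nil to a constant.
  3. (constant → constant+): from a simple constant to a more complex one.
  4. (constant → scalar): replace a constant with a variable or an argument.
  5. (statement → statements): add more unconditional statements (such as break, continue, return).
  6. (unconditional → if): split the execution path.
  7. (scalar → array): from a single value to an array.
  8. (array → container): from an array to a container.
  9. (statement → recursion): from a statement to recursion.
  10. (if → while): from a condition to a loop.
  11. (expression → function): replace an expression with a function or algorithm.
  12. (variable → assignment): replace the value of a variable.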

The simplest change that makes the first test pass is to return an empty list.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }

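Before the next test, the parameter type changes from std::string to std::wstring, which gives the lexer Unicode support. The next test expects a single plus operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }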





The test uses AssertRange, a small helper whose AreEqual compares an expected initializer list with an actual range. It checks the sizes first and then compares element by element, reporting the position of a mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        // Both ranges must have the same number of elements.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
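Outside the Visual Studio test framework, the same element-by-element comparison can be sketched in portable C++. The names FirstMismatch and FirstMismatchExample are hypothetical and not part of the article's code. The helper returns the index of the first differing position, or -1 when both ranges match in length and content.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper (not from the article): index of the first position
// where `expect` and `actual` differ, or -1 if they match entirely.
// A length difference shows up as a mismatch at the end of the shorter range.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = expect.begin();
    auto actualIter = std::begin(actual);
    // Walk both ranges while the elements are equal.
    while(expectIter != expect.end() && actualIter != std::end(actual) && *expectIter == *actualIter) {
        ++expectIter;
        ++actualIter;
    }
    if(expectIter == expect.end() && actualIter == std::end(actual)) return -1;
    return std::distance(expect.begin(), expectIter);
}

// Usage example with plain integers.
inline void FirstMismatchExample() {
    std::vector<int> actual{ 1, 5, 3 };
    assert(FirstMismatch({ 1, 5, 3 }, actual) == -1); // identical ranges
    assert(FirstMismatch({ 1, 2, 3 }, actual) == 1);  // differs at index 1
}
```

Unlike the article's helper, this variant returns a position instead of failing a test, so it can be used with any assertion mechanism.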







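Operator is an enum class with wchar_t as its underlying type, so the value of each enumerator is the operator's own character. For now, a token is simply an operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;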





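To print a value in a failure message, Assert relies on a ToString function for the compared type, so one is added for Token.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }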







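To make the test pass, apply the (unconditional → if) transformation: an empty expression still yields an empty list, and anything else is treated as an operator.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }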





The next test expects a single digit to become a number token.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





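The current Token cannot represent a number, so its design has to change. Among the options are a Token class hierarchy, which leads to dynamic_cast, an approach built on std::function, and a generic holder such as Boost.Any.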



Start with a small Token class that reports its type, driven by its own tests in TokenTests.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };





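A ToString function for TokenType is needed as well, so that the assertion compiles. The next test checks the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }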





The (constant → scalar) transformation makes it pass: the constant return value becomes a member set by the constructor.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };





Next, the operator stored in a token must be retrievable.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





A conversion operator and a member to hold the value are enough.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };





The same applies to the value of a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





A token holds either an operator or a number, never both, so the two values can share storage in a union. Each conversion checks the token type and throws on a mismatch.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            // Wide literal: the function returns std::wstring.
            default: return L"Unknown token.";
        }
    }
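The runtime checks in the two conversion operators can be exercised as in the following sketch. It repeats a trimmed copy of Token so that it compiles on its own. The helper ConvertsToNumber and the function TokenConversionExample are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Trimmed copy of the article's Token, kept only to make the sketch self-contained.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: true if the token can be read as a number,
// false if the conversion throws std::logic_error.
inline bool ConvertsToNumber(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return true;
    } catch(const std::logic_error &) {
        return false;
    }
}

// Usage example: only the member matching the stored type is readable.
inline void TokenConversionExample() {
    assert(ConvertsToNumber(Token(1.5)));
    assert(!ConvertsToNumber(Token(Operator::Plus)));
    assert(static_cast<Operator>(Token(Operator::Plus)) == Operator::Plus);
}
```

Reading the wrong union member would be undefined behavior, which is why the stored TokenType is checked before every access.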







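With Token able to hold a number, return to the lexer. A first implementation that passes the single-digit test:

    if(expr[0] >= L'0' && expr[0] <= L'9') {
        // Convert the digit character to its numeric value.
        return{ static_cast<double>(expr[0] - L'0') };
    }
    return{ static_cast<Operator>(expr[0]) };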





The next test expects a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





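The C library already has what is needed: isdigit to detect a digit and atof to convert text to a number, each with a wchar_t counterpart (iswdigit and the Microsoft-specific _wtof). This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }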





The next test expects an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





The function must now return more than one token. First refactor it, without changing behavior, so that tokens are collected in a result variable.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }





Now apply the (if → while) transformation, so that the whole expression is scanned.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // continue right after the number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





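wcstod replaces _wtof because it also reports, through its second argument, where the number ended. Scanning can then continue from that position.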
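The behavior of the end pointer of wcstod can be checked in isolation. The sketch below uses two hypothetical names, ParseNumber and ParseNumberExample, that do not appear in the article. It assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: parses one number at the start of `text` and reports
// how many wide characters wcstod consumed.
inline double ParseNumber(const wchar_t *text, std::size_t &consumed) {
    wchar_t *end = nullptr;
    double value = std::wcstod(text, &end);
    consumed = static_cast<std::size_t>(end - text);
    return value;
}

// Usage example: parsing stops at '+', so scanning can resume there.
inline void ParseNumberExample() {
    std::size_t consumed = 0;
    double value = ParseNumber(L"12.34+5", consumed);
    assert(consumed == 5); // "12.34" is five characters long
    assert(std::fabs(value - 12.34) < 1e-9);

    // A lone operator is not a number: nothing is consumed.
    ParseNumber(L"+", consumed);
    assert(consumed == 0);
}
```

The second case shows why the lexer checks iswdigit before calling wcstod: when nothing is consumed, the current position does not advance, and a loop that relied on wcstod alone would never terminate.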



The next test expects spaces to be skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





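The (unconditional → if) transformation handles this: a character is stored as an operator only if it really is one, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current; // not a digit and not an operator: skip it
        }
    }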





Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







The last lexer test covers a complex expression with every operator and both parentheses.

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }





Operator gains the remaining values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





IsOperator in Tokenizer must now recognize all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        // True if the current character matches any known operator.
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }
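The same membership check can also be written as a lookup in a string of operator characters. This is only an alternative sketch: IsOperatorChar and IsOperatorCharExample are hypothetical free functions, not part of the article's Tokenizer.

```cpp
#include <cassert>
#include <string>

// Hypothetical free-function variant of IsOperator: looks the character up
// in a string that lists every operator character.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}

// Usage example.
inline void IsOperatorCharExample() {
    assert(IsOperatorChar(L'+'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
    // The terminator is not part of the searched range, so it is not an operator.
    assert(!IsOperatorChar(L'\0'));
}
```

The lookup is shorter, but it repeats the operator characters outside the Operator enum. The version with std::any_of keeps the enum as the only place where the characters are defined.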





The complete code after these steps:

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <initializer_list>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                    default: throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number: return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator: return ToString(static_cast<Operator>(token));
            default: throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {
                return *m_current == static_cast<wchar_t>(o);
            });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







The code for this article is available on GitHub.



. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







The implementation now has to tell an empty expression from a non-empty one, which is the (unconditional → if) transformation.



inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





The next test covers a number that consists of a single digit.



TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};

…

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};





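A ToString function is needed for TokenType as well, because the test compares values of that type. The next test covers the type of a number token.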



TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}





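The test fails. The (constant → scalar) transformation makes it pass: the token type is stored in a field instead of being a constant.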



class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }

private:
    TokenType m_type;
};





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





.



class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





.



TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





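Since a token holds either an operator or a number, but never both at once, the two fields can share storage in a union.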



Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
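The behaviour of the two conversion operators can be checked with standard C++11 alone. The sketch below is a condensed copy of the Token class with only the Plus operator; NumberAccessThrows is a hypothetical helper added for illustration and is not part of the article's code:

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // Each conversion checks the stored type before reading the union.
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true if reading the token as a number throws.
inline bool NumberAccessThrows(const Token &token) {
    try {
        double value = token;
        (void)value;
    } catch(const std::logic_error &) {
        return true;
    }
    return false;
}
```

Reading a union member other than the one last written is undefined behaviour in C++, which is why the type check before each access matters.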







With Token in place, the lexer can return a number token for a single digit:



if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





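The standard C library already has what is needed here: isdigit to recognize a digit and atof to parse a number, or rather their counterparts for wchar_t. Using them is the (expression → function) transformation.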



inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}





. .



TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





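To return more than one token, the tokens are first collected in a local variable, result, which is returned at the end.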



inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}





The next transformation is (if → while): the single check becomes a loop that runs until the end of the string.



inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





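wcstod replaces _wtof because, besides parsing the number, it reports where the number ends, and the loop needs that position in order to move on.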
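The second argument of wcstod receives a pointer to the first character that was not part of the number. A minimal sketch of that behaviour, with a hypothetical helper ScanNumber (a free function written for illustration, not the article's member function) that returns the value and advances the cursor:

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper: parses the number that starts at `current` and moves
// `current` to the first character after it.
inline double ScanNumber(const wchar_t *&current) {
    wchar_t *end = nullptr;
    double value = std::wcstod(current, &end);
    current = end;
    return value;
}
```

For the input L"12.5+3" the helper yields 12.5 and leaves the cursor on the plus sign, so the next loop iteration sees the operator.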



. .

.



TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





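The (unconditional → if) transformation is enough here: only the plus sign is treated as an operator, and any other character is skipped.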



while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current;
    }
}
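This loop can be exercised without the Visual Studio test framework. The sketch below assumes a simplified, hypothetical SimpleToken struct in place of the article's Token class, and a hypothetical function name TokenizeSketch:

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical simplified token: either a number or an operator character.
struct SimpleToken {
    bool isNumber;
    double number;  // meaningful when isNumber is true
    wchar_t op;     // meaningful when isNumber is false
};

inline std::vector<SimpleToken> TokenizeSketch(const std::wstring &expr) {
    std::vector<SimpleToken> result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            // Parse the number and jump to the character after it.
            wchar_t *end = nullptr;
            double value = std::wcstod(current, &end);
            result.push_back({ true, value, L'\0' });
            current = end;
        } else if(*current == L'+') {
            result.push_back({ false, 0.0, *current });
            ++current;
        } else {
            ++current;  // everything else, including spaces, is skipped
        }
    }
    return result;
}
```

Note that skipping every unrecognized character also hides typos in the input; the sketch mirrors the loop at this stage of the article and does not report errors.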





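Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function only delegates to it.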



inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}





Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







The remaining operators and the parentheses are covered by one more test, on a complete expression.



TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}





The Operator enumeration is extended with the missing values.



enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





The IsOperator method of Tokenizer must now recognize all of them.



bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) {
        return *m_current == static_cast<wchar_t>(o);
    });
}
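The same membership test can be written as a free function and checked in isolation. This is a sketch under the assumption of a hypothetical name, IsOperatorChar, and a plain array in place of the initializer list used above:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function variant of IsOperator: true if `c` is the
// character of one of the known operators.
inline bool IsOperatorChar(wchar_t c) {
    const Operator all[] = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(std::begin(all), std::end(all), [c](Operator o) {
        return c == static_cast<wchar_t>(o);
    });
}
```

Because each enumerator's value is the operator's own character, the check is a plain comparison after a cast, with no lookup table to keep in sync.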





. .



Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}







GitHub . , . "__".



. , .




 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (ifwhile). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







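The lexer still recognizes only the plus sign. The next test covers a complete expression with all of the operators and parentheses.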



TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
        Token(3), Token(Operator::Div), Token(Operator::LParen),
        Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}





For the test to compile, the Operator enumeration is extended with the missing values.



enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





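The test fails, because IsOperator in the Tokenizer class still recognizes only the plus sign. It has to check against every operator.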



bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
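In the version above, auto all = { ... } deduces a std::initializer_list<Operator>, which std::any_of then searches. An alternative that some readers may find easier to extend is a switch over the enumeration. The sketch below is only an illustration: IsOperatorChar is a hypothetical free function, and the enumeration is repeated so that the fragment compiles on its own.

```cpp
#include <cassert>

// Same values as the article's Operator enumeration, repeated here so the
// sketch is self-contained.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

// Hypothetical free-function variant of Tokenizer::IsOperator.
// Because Operator has a fixed underlying type, any wchar_t value can be
// converted to it, and unknown values simply fall into the default branch.
inline bool IsOperatorChar(wchar_t c) {
    switch(static_cast<Operator>(c)) {
        case Operator::Plus:
        case Operator::Minus:
        case Operator::Mul:
        case Operator::Div:
        case Operator::LParen:
        case Operator::RParen:
            return true;
        default:
            return false;
    }
}
```

Both variants need a new entry whenever an operator is added to the enumeration, so the choice is mostly a matter of taste.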





The tests pass. The complete listings at this point follow.



Interpreter.h

#pragma once
#include <vector>
#include <string>
#include <stdexcept>
#include <wchar.h>
#include <wctype.h>
#include <algorithm>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

// A token is either an operator or a number; m_type tells which union member is valid.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}

    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include <initializer_list>
#include <iterator>
#include <string>
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

// Compares an expected list with an actual range, first by size and then element by element.
template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen),
            Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
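The AssertRange helper above depends on CppUnitTestFramework, so it only runs inside Visual Studio. For readers who want to experiment with the same idea elsewhere, here is a framework-free sketch. It is an assumption-laden illustration, not part of the article's code: FirstMismatch is a hypothetical name, and it returns a position instead of raising a test failure.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first position where expect
// and actual differ, counting a missing element as a mismatch.
// Returns -1 when both sequences are equal.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = expect.begin();
    auto actualIter = std::begin(actual);
    long position = 0;
    for(; expectIter != expect.end() && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    // Reaching here means one or both sequences ran out of elements.
    const bool sameLength =
        expectIter == expect.end() && actualIter == std::end(actual);
    return sameLength ? -1 : position;
}
```

Like the article's helper, it only requires operator== on the element type, so it would also work with Token once the header is included.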







The source code for this article is available on GitHub.



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter










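The minimal implementation: `Tokenize` returns a `std::vector` of tokens, aliased as `Tokens`. For now the function only throws an exception, so the first test compiles and fails.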



    #pragma once

    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Placeholder that always throws, so that the first test fails.
    inline Tokens Tokenize(std::string expr) { throw std::exception(); }

    } // namespace Lexer
    } // namespace Interpreter





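To make the test pass we want the smallest possible change to the code. A useful guide here is The Transformation Priority Premise (TPP). Its idea is that as the tests become more specific, the code becomes more generic, and that each change to the code is one of a small set of transformations. The transformations are ordered from simple to complex, and the simpler ones should be preferred whenever they are enough to pass the current test.

The transformations, in order of priority:

  1. ({} → nil): from no code at all to code that uses nil.
  2. (nil → constant): replacing nil with a constant.
  3. (constant → constant+): replacing a simple constant with a more complex one.
  4. (constant → scalar): replacing a constant with a variable or an argument.
  5. (statement → statements): adding more unconditional statements (break, continue, return and the like).
  6. (unconditional → if): splitting the execution path.
  7. (scalar → array): replacing a scalar with an array.
  8. (array → container): replacing an array with a more general container.
  9. (statement → recursion): replacing a statement with recursion.
  10. (if → while): replacing a condition with a loop.
  11. (expression → function): replacing an expression with a function or an algorithm.
  12. (variable → assignment): replacing the value of a variable.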
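To make the ordering more concrete, here is a small sketch that is not taken from the article. It assumes a hypothetical function `CountLeadingPluses` and shows three successive versions of it, each produced by one transformation from the list when a new test demands it. The namespaces `Step1` to `Step3` exist only to keep the versions side by side.

```cpp
#include <string>

// Hypothetical example, not from the article: three successive versions of a
// function that counts the leading L'+' characters of a string.

namespace Step1 {
// (nil → constant): enough for a test that expects 0 for an empty string.
inline int CountLeadingPluses(const std::wstring &) { return 0; }
}

namespace Step2 {
// (unconditional → if): enough for a test that expects 1 for L"+".
inline int CountLeadingPluses(const std::wstring &text) {
    if(!text.empty() && text[0] == L'+') return 1;
    return 0;
}
}

namespace Step3 {
// (if → while): needed once a test expects 2 for L"++".
inline int CountLeadingPluses(const std::wstring &text) {
    int count = 0;
    while(count < static_cast<int>(text.size()) && text[count] == L'+') {
        ++count;
    }
    return count;
}
}
```

Each version is the simplest code that satisfies the tests written so far, and the loop only appears when a test can no longer be passed with a condition.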

For the first test the simplest transformation is enough: the function returns an empty list.



    inline Tokens Tokenize(std::string expr) {
        return{};
    }





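The first test now passes. The next test feeds the lexer a single operator. The expression parameter also changes from `std::string` to `std::wstring`, so that expressions are passed as Unicode strings.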



    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }





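`AssertRange` is a small helper namespace. Its `AreEqual` function compares an expected initializer list with the actual range element by element, and reports either a size mismatch or the position of the first element that differs.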



AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // Compare the sizes first, then the elements pairwise.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange







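Operators are represented by the `Operator` enumeration. Its underlying type is `wchar_t` and the value of each enumerator is the character of the operator, so a character can be turned into an operator with a cast. For now `Token` is simply an alias for `Operator`.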



    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;





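To report a failed comparison, `Assert` has to convert the compared values to strings. For a custom type this requires a `ToString` function, so we add one.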



std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }







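The test now compiles but fails, because `Tokenize` still returns an empty list. To pass it we split the execution path, that is, apply (unconditional → if).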



    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }





The next test tokenizes a single digit.



    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





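This test cannot be satisfied while `Token` is just an alias for `Operator`. A token has to hold either an operator or a number, and there are several ways to achieve that:

  1. A base class `Token` with a derived class for each kind of token, recovering the concrete kind through `dynamic_cast`.
  2. An approach based on callbacks, for example with `std::function`.
  3. A ready-made generic container such as Boost.Any, which would add a dependency outside the standard library.
  4. A single class that stores the type of the token next to its value.

The last option is the one used here. It calls for a separate list of tests for `Token`, starting with the token type.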



    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };

    …

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };





As before, the assertion needs a `ToString` function, this time for `TokenType`. The next test covers number tokens.



    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }





To pass it, the constant returned by `Type()` is replaced with a member variable: (constant → scalar).



    class Token {
    public:
        Token(Operator) : m_type(TokenType::Operator) {}
        Token(double) : m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }

    private:
        TokenType m_type;
    };





The next test reads the operator back from an operator token.



    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





The token stores the operator and returns it through a conversion operator.



    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };





The same is needed for number tokens.



    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





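A token holds either an operator or a number, never both, so the two values can share storage in a `union`. Each conversion checks the token type first and throws if the token holds the other kind of value.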



Token

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal, because the function returns std::wstring.
            return L"Unknown token.";
        }
    }
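The combination of a type tag and a `union` can be tried in isolation. The sketch below is self-contained and deliberately not the article's `Token`: it assumes a hypothetical `Value` class that holds either an `int` or a `double`, and the names `Value`, `Kind`, `GetKind`, `AsInt` and `AsDouble` are made up for illustration.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the article's Token: stores exactly one of two
// alternatives and remembers which one is active.
class Value {
public:
    enum class Kind { Int, Double };

    explicit Value(int i) : m_kind(Kind::Int), m_int(i) {}
    explicit Value(double d) : m_kind(Kind::Double), m_double(d) {}

    Kind GetKind() const { return m_kind; }

    // Each accessor checks the tag first, so the inactive union member
    // is never read.
    int AsInt() const {
        if(m_kind != Kind::Int) throw std::logic_error("Not an int.");
        return m_int;
    }

    double AsDouble() const {
        if(m_kind != Kind::Double) throw std::logic_error("Not a double.");
        return m_double;
    }

private:
    Kind m_kind;
    union {
        int m_int;
        double m_double;
    };
};
```

The object is no larger than the tag plus the largest alternative, and asking for the wrong alternative fails loudly with an exception instead of returning garbage.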







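With `Token` in place we can return to the lexer. The single-digit test is satisfied by the following change to `Tokenize`: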



    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };





The test passes. The next test uses a number with several digits and a fractional part.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





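There is no need to parse the number by hand, because the C standard library already has what we need: `isdigit` to recognise a digit and `atof` to convert text to a number, or rather their counterparts for `wchar_t` strings. This replaces an expression with a function: (expression → function).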



    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        // _wtof is the Microsoft-specific wide-character counterpart of atof.
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }





The next test combines an operator and a number.



    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





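Returning from each branch no longer works, since the function now has to return more than one token. As a preparatory step the tokens are collected in a local variable `result`, which is returned at the end. The new test still fails at this point.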



    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }





Now the required transformation is obvious: (if → while). The loop runs until the end of the string is reached.



    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





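`wcstod` is used here instead of `_wtof` because, besides converting the number, it reports where the number ends. Scanning can then continue from that position.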
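The end-pointer behaviour of `wcstod` is easy to check on its own. The following sketch is not part of the article's code: it assumes a hypothetical helper `ExtractNumbers` that walks a string, parses every number that starts with a digit and ignores everything else. It relies on the default "C" locale, where the decimal separator is a period.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper, not from the article: collects every number that
// starts with a digit and skips all other characters.
inline std::vector<double> ExtractNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod stores in `end` the address of the first character
            // that does not belong to the parsed number.
            numbers.push_back(std::wcstod(current, &end));
            current = end;  // resume right after the number
        }
        else {
            ++current;      // not the start of a number, skip it
        }
    }
    return numbers;
}
```

Called with `L" 1 + 12.34 "`, the helper visits the two numbers and steps over the spaces and the plus sign one character at a time, which is the same movement the lexer loop performs.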



The next test adds spaces to the expression.



    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





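Applying (unconditional → if) once more: a character becomes an operator token only if it is a known operator, and any other character is skipped.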



    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }





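All tests pass, so it is time to refactor. The scanning logic moves into a `Tokenizer` class in a nested `Detail` namespace, and the `Tokenize` function becomes a thin wrapper around it.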



    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }





Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







The last lexer test covers an expression that contains all the remaining operators and parentheses.



    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }





The `Operator` enumeration is extended with the missing operators and parentheses.



    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





The `IsOperator` method of `Tokenizer` then has to recognise all of them.



    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
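The same membership check can be exercised outside the class. The sketch below is an illustration rather than the article's code: it assumes a hypothetical free function `IsOperatorChar` that takes the character as a parameter, and it repeats the `Operator` enumeration only so that the fragment is self-contained.

```cpp
#include <algorithm>
#include <iterator>

// Mirrors the article's enumeration so that the sketch is self-contained.
enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function variant of Tokenizer::IsOperator: returns true
// if the character is the code of one of the known operators.
inline bool IsOperatorChar(wchar_t c) {
    static const Operator all[] = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(std::begin(all), std::end(all),
        [c](Operator o) { return c == static_cast<wchar_t>(o); });
}
```

Because the function does not depend on the tokenizer state, it can be tested directly with individual characters such as `L'+'` or `L' '`.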





All tests pass. The complete code of the lexer and its tests follows.



Interpreter.h

    #pragma once

    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        // Tokens are equal when both the type and the stored value match.
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = {
                Operator::Plus, Operator::Minus, Operator::Mul,
                Operator::Div, Operator::LParen, Operator::RParen
            };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







The code for the article is available on GitHub.










GitHub . , . "__".



. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







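With Token in place, we can return to the lexer test for a single digit. A small change to Tokenize is enough to make it pass:

    if(expr[0] >= L'0' && expr[0] <= L'9') {
        return{ static_cast<double>(expr[0] - L'0') };
    }
    return{ static_cast<Operator>(expr[0]) };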





This handles one digit only, so the next test uses a floating point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





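The simplest route is the C standard library: isdigit to detect a digit and atof to parse the number, or rather their wide-character counterparts, since we work with wchar_t. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }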





The test passes. Next, an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





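To pass this test the function has to return more than one token. First a refactoring that does not change behavior: tokens are collected in a local variable result, which is returned at the end.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }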





Now the transformation: (if → while). The loop consumes the input until the end of the string is reached.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





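Note that wcstod replaces _wtof here: through its second argument it reports where the number ended, so scanning can continue from that position.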
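The role of the end pointer can be shown in isolation. The sketch below uses only the standard library; ScanNumbers is a hypothetical helper, not part of the interpreter, that collects every number in a string and skips everything else.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: returns all numbers found in the text, in order.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while (*current != L'\0') {
        if (std::iswdigit(*current)) {
            // wcstod parses as many characters as form a number and
            // stores the position of the first unparsed character in end.
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(current, &end));
            current = end;  // continue right after the number
        } else {
            ++current;      // not the start of a number: skip one character
        }
    }
    return numbers;
}
```

Because the loop only calls wcstod when the current character is a digit, at least one character is always consumed and the loop cannot stall.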



Next, whitespace should be skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





The (unconditional → if) transformation: a character becomes an operator token only if it is a known operator, and anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }





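The function has grown, so it is time to refactor. The logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }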





Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







The last lexer test is an expression that uses every operator and both parentheses.

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }





The Operator enumeration gets the remaining values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





The IsOperator method of Tokenizer now has to recognize all of them.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                     Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }





All tests pass. The complete code at this point:

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExpression()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExpression() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                         Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_expression) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
                Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
                Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }
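For readers without Visual Studio, the lexer can be tried with nothing but the standard library. The following is a condensed sketch, not the article's code: the Sketch namespace, the plain Token struct and the MakeNumber and MakeOperator helpers are assumptions made to keep the example short, and operators are recognized with a string lookup instead of std::any_of.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

namespace Sketch {

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Simplified token: a plain struct instead of the union-based class.
struct Token {
    bool isNumber;
    Operator op;    // meaningful only when isNumber is false
    double number;  // meaningful only when isNumber is true
};

inline Token MakeNumber(double value) { return Token{ true, Operator::Plus, value }; }
inline Token MakeOperator(Operator op) { return Token{ false, op, 0.0 }; }

inline bool operator==(const Token &left, const Token &right) {
    if (left.isNumber != right.isNumber) return false;
    return left.isNumber ? left.number == right.number : left.op == right.op;
}

inline bool IsOperator(wchar_t ch) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(ch) != std::wstring::npos;
}

inline std::vector<Token> Tokenize(const std::wstring &expr) {
    std::vector<Token> result;
    const wchar_t *current = expr.c_str();
    while (*current != L'\0') {
        if (std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(MakeNumber(std::wcstod(current, &end)));
            current = end;
        } else if (IsOperator(*current)) {
            result.push_back(MakeOperator(static_cast<Operator>(*current)));
            ++current;
        } else {
            ++current;  // skip whitespace and anything unrecognized
        }
    }
    return result;
}

} // namespace Sketch
```

The checks from LexerTests translate directly into comparisons on the returned vector, for example that L" 1 + 12.5 " yields a number, a plus and a number.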







GitHub . , . "__".



. , .




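The lexer returns a std::vector of tokens, aliased as Tokens. The minimal code that lets the first test compile is shown below; Tokenize throws for now, so the test fails.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer
    } // namespace Interpreter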





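A failing test can be made to pass in many ways, and the order in which the code is changed matters. Robert Martin's The Transformation Priority Premise (TPP) addresses this question. Just as refactorings change the structure of code without changing its behavior, transformations change its behavior, and they can be ordered from simple to complex. The premise is that simpler transformations should be preferred when making a test pass, and that the next test should be chosen so that a simple transformation is enough. Following the TPP helps avoid a test that can only be passed by rewriting a large part of the code.

The transformations, in priority order:

  1. ({} → nil) From no code at all to code that returns an empty value.
  2. (nil → constant) Replace the empty value with a constant.
  3. (constant → constant+) Replace a simple constant with a more complex one.
  4. (constant → scalar) Replace a constant with a variable or an argument.
  5. (statement → statements) Add more unconditional statements (break, continue, return and the like).
  6. (unconditional → if) Split the execution path.
  7. (scalar → array) Replace a variable or argument with an array.
  8. (array → container) Replace an array with a more complex container.
  9. (statement → recursion) Replace a statement with recursion.
  10. (if → while) Replace a condition with a loop.
  11. (expression → function) Replace an expression with a function or algorithm.
  12. (variable → assignment) Change the value of a variable.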
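As a small illustration of how the transformations are applied one test at a time, the sketch below grows a function that sums a list of integers. The example and the names SumStep1 to SumStep3 are hypothetical and unrelated to the interpreter; each version is the result of the simplest transformation that makes the next test pass.

```cpp
#include <cstddef>
#include <vector>

// Test 1: "the sum of no values is 0".
// (nil → constant): returning a constant is enough.
inline int SumStep1(const std::vector<int> &) {
    return 0;
}

// Test 2: "the sum of a single value is that value".
// (unconditional → if) splits the path, and (constant → scalar)
// returns a value taken from the argument instead of a constant.
inline int SumStep2(const std::vector<int> &values) {
    if (values.empty()) return 0;
    return values[0];
}

// Test 3: "the sum of several values".
// (if → while): the branch becomes a loop over all elements.
inline int SumStep3(const std::vector<int> &values) {
    int total = 0;
    std::size_t index = 0;
    while (index < values.size()) {
        total += values[index];
        ++index;
    }
    return total;
}
```

Each step keeps the earlier tests passing: SumStep3 still returns 0 for an empty list and the single element for a list of one.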

We apply the first transformation: instead of throwing, the function returns an empty list.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }





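The test passes. Before moving on, std::string is replaced with std::wstring so that expressions can contain Unicode characters. The next test expects a single plus operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }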





AssertRange is a small helper namespace of our own. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of a mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
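The same element-by-element comparison can be sketched without the test framework. FirstMismatch below is a hypothetical helper, not part of the article's code: instead of failing an assertion it returns the position of the first difference, or -1 when the ranges are equal.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>

// Hypothetical helper: -1 when equal, otherwise the index of the first
// differing element (or the length of the shorter range on size mismatch).
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t position = 0;

    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        // Convert the actual element to the expected type before comparing.
        if (!(*expectIter == static_cast<T>(*actualIter))) return position;
    }

    // One range ended before the other: report where the shorter one stops.
    if (expectIter != std::end(expect) || actualIter != std::end(actual)) return position;
    return -1;
}
```

As in AssertRange::AreEqual, the expected values are written as a braced list, so a call looks like FirstMismatch({ 1, 2, 3 }, actual).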







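Operator is an enumeration whose underlying type is wchar_t, so a character can be turned into an operator with a simple cast. For now a token is just an operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;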





For Assert to print a value of our own type when a test fails, a ToString function has to be provided for that type.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }
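The round trip between a character and an operator can be checked in isolation. This is a minimal sketch using only the standard library; FromChar is a hypothetical helper introduced for the illustration.

```cpp
#include <string>

enum class Operator : wchar_t {
    Plus = L'+',
};

// Operator → text: the enumerator value is the character itself.
inline std::wstring ToString(Operator op) {
    return std::wstring(1, static_cast<wchar_t>(op));
}

// Hypothetical helper: text → Operator. The cast does not validate the
// character, so the caller must pass a character that names an operator.
inline Operator FromChar(wchar_t ch) {
    return static_cast<Operator>(ch);
}
```

Because the enumerators are defined by their characters, no lookup table is needed in either direction.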







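The function still returns an empty list, so the new test fails. To pass it without breaking the first test, we apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }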





Next, a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





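The test does not compile: a token can now be either an operator or a number, so Token needs to hold values of different kinds. Several designs are possible: a class hierarchy derived from Token, with dynamic_cast used to find the concrete type; callbacks based on std::function; or a ready-made type such as Boost.Any.

We take the simplest route: a single Token class that knows its own type.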



InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




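Tokens here is simply a std::vector of Token. Neither it nor Lexer::Tokenize exists yet, so the test does not compile. Add the minimum that lets it compile and fail:

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    // Stub: compiles, but makes the first test fail (red).
    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer
    } // namespace Interpreter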





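The test compiles and fails, so now it has to be made to pass with the simplest change that works. A useful guide here is Robert Martin's The Transformation Priority Premise (TPP). Refactorings change the structure of code without changing its behavior; transformations are the opposite, small changes that alter behavior. The transformations are ordered from simple to complex, and when making a test pass you should prefer the simplest one that works. Choosing a complex transformation too early tends to produce code that is more general than the tests require. If a test can only be passed with a complex transformation, it is often better to pick a different test first.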



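The transformations, from the simplest to the most complex:

  1. ({} → nil) From no code at all to code that returns nil.
  2. (nil → constant) Replace nil with a constant.
  3. (constant → constant+) Replace a simple constant with a more complex one.
  4. (constant → scalar) Replace a constant with a variable or an argument.
  5. (statement → statements) Add more unconditional statements (break, continue, return and so on).
  6. (unconditional → if) Split the execution path.
  7. (scalar → array) Replace a variable or constant with an array.
  8. (array → container) Replace an array with a more complex container.
  9. (statement → recursion) Replace a statement with recursion.
  10. (if → while) Replace a condition with a loop.
  11. (expression → function) Replace an expression with a function or an algorithm.
  12. (variable → assignment) Change the value of a variable.

For the first test the very first transformation is enough: return an empty list.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }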





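The test passes, and there is nothing to refactor yet. The list of tests to write next:

  1. Tokenize a single operator.
  2. Tokenize a single digit.
  3. Tokenize a floating-point number.
  4. Tokenize an operator followed by a number.
  5. Skip spaces.
  6. Tokenize a complex expression with several operators and parentheses.

Before moving on, std::string should be replaced with std::wstring so that expressions can be passed as wide strings with Unicode support. Now the next test:

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }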





AssertRange is a small helper namespace with a single function, AreEqual. It compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        // The ranges must have the same length.
        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        // Compare element by element and report the failing position.
        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
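Outside Visual Studio, the same comparison idea can be sketched with the standard library alone. The helper below is hypothetical: DescribeMismatch is not part of the article's code or of CppUnitTestFramework. Instead of asserting, it returns a description of the first difference, or an empty string when the ranges match.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical, framework-free analogue of AssertRange::AreEqual.
// Returns an empty string when the ranges match, otherwise a short
// description of the first difference found.
template<class T, class ActualRange>
std::string DescribeMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectSize = std::distance(std::begin(expect), std::end(expect));
    auto actualSize = std::distance(std::begin(actual), std::end(actual));
    if(expectSize != actualSize) {
        return "Size differs.";
    }

    auto actualIter = std::begin(actual);
    std::size_t position = 0;
    for(const T &expected : expect) {
        // Convert the actual element to T first, as AreEqual<T> does in the final listing.
        if(!(expected == static_cast<T>(*actualIter))) {
            return "Mismatch in position " + std::to_string(position);
        }
        ++actualIter;
        ++position;
    }
    return std::string();
}
```

For a std::vector<int> holding 1, 2, 3, a call such as DescribeMismatch({ 1, 5, 3 }, values) reports position 1, while an expected list of a different length is rejected before any element is compared.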







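Operator is an enumeration. Its underlying type is wchar_t, and the value of each enumerator is the character that denotes the operator, so a character from the expression can be turned into an operator with a plain cast. For now Token is just an alias for Operator:

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;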
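The round trip between a character and the enumeration can be tried in isolation. This is only a sketch: ToOperator is a hypothetical helper name that does not appear in the article, and the ToString here takes its argument by value for brevity.

```cpp
#include <cassert>
#include <string>

// Same idea as in the article: the enumerator value is the operator's character.
enum class Operator : wchar_t {
    Plus = L'+',
};

// Operator -> text: wrap the underlying character in a one-character string.
inline std::wstring ToString(Operator op) {
    return std::wstring(1, static_cast<wchar_t>(op));
}

// Hypothetical helper (not in the article): character -> Operator.
// The cast never fails, so the caller has to check the character beforehand.
inline Operator ToOperator(wchar_t symbol) {
    return static_cast<Operator>(symbol);
}
```

Because the enumeration has a fixed underlying type, the cast accepts any wchar_t, including characters that are not operators at all. That is why the lexer later checks the character before casting it.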





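To print a readable message when a comparison fails, Assert needs a ToString function for the type being compared, so one has to be added for Token.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }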







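Now everything compiles, and the test fails because Tokenize always returns an empty list. To make it pass, split the execution path with the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }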





Both tests pass. The next test on the list covers a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





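This test does not even compile. Token is an alias for Operator, so it cannot hold a number, yet numbers and operators have to live in the same list. There are several ways to solve this. One is a class hierarchy with a common base class Token, using dynamic_cast to get at the concrete token. Another is a solution built on std::function. A ready-made type such as Boost.Any would also fit, but this project is limited to the standard library.

The simplest option is taken here: Token becomes a class that remembers what kind of value it holds. This adds new entries to the test list, starting with the token type.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };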





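A ToString function for TokenType is needed as well, so that Assert can print the value; it is trivial and is shown in the final listing. The next test checks the type of a number token.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }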





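The test fails. Apply the (constant → scalar) transformation: the constant returned by Type is replaced with a member variable set in the constructor.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };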





The token also has to give back the value it stores. First, the operator.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





A conversion operator is enough to make it pass.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };





The same test for a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





A token holds either an operator or a number, never both, so the two members can share storage in a union. The conversion operators check the type tag before reading the value and throw if the token holds the other kind.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        // Only the member selected by m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: the function returns std::wstring.
                return L"Unknown token.";
        }
    }
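The hand-written tag plus union is what the article uses with the Visual Studio 2013 toolchain. With a C++17 compiler the same contract could probably be expressed with std::variant, which keeps track of the active member by itself. The sketch below is an alternative under that assumption, not the article's implementation; VariantToken is a hypothetical name, and the enumerations are repeated only to keep the block self-contained.

```cpp
#include <cassert>
#include <stdexcept>
#include <variant>

enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Hypothetical C++17 alternative to the tagged union.
class VariantToken {
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    // The variant knows which alternative is active, so no separate tag is stored.
    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator : TokenType::Number;
    }

    operator Operator() const {
        if(auto op = std::get_if<Operator>(&m_value)) return *op;
        throw std::logic_error("Should be operator token.");
    }

    operator double() const {
        if(auto num = std::get_if<double>(&m_value)) return *num;
        throw std::logic_error("Should be number token.");
    }

private:
    std::variant<Operator, double> m_value;
};
```

The public interface stays the same as in the union version, so the four TokenTests would apply to it unchanged apart from the class name.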







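Token can now hold both kinds of values, and all TokenTests pass, so we can return to the lexer test for a single digit. The simplest change that makes it pass:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };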





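The test passes, but only single-digit numbers are handled. The next test requires a real floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }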





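Rather than parse the number by hand, use the C standard library: isdigit to detect a digit and atof to convert the text to a number, or rather their wchar_t counterparts, since the expression is a wide string. This is the (expression → function) transformation. Note that _wtof is specific to the Microsoft C runtime.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }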





The test passes. Next on the list is an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





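The function can return only one token, so the test fails. Before adding a loop, prepare the code for it: collect the tokens in a local variable result and return it at the end. The earlier tests must still pass after this step.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }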





Now the transformation itself: (if → while). The check for the end of the expression becomes the loop condition, and each branch advances the current position.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





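wcstod replaces _wtof here because, besides the value, it reports through its second argument where the number ended. The scan can then continue right after the number instead of stepping through its digits one at a time.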
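How wcstod reports the end of the number can be checked on its own. In this sketch ParseLeadingNumber is a hypothetical helper, not something from the article; it returns the parsed value and the number of characters the number occupied.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: parses the number at the start of text and reports
// how many characters it occupied (0 when text does not start with a number).
inline double ParseLeadingNumber(const wchar_t *text, std::size_t &consumed) {
    wchar_t *end = nullptr;
    double value = std::wcstod(text, &end);
    // wcstod leaves end pointing at the first character it did not use.
    consumed = static_cast<std::size_t>(end - text);
    return value;
}
```

For L"12.5+3" the helper yields 12.5 and consumes four characters, leaving the scan at the plus sign. When no conversion is possible, as for L"+", wcstod returns zero and leaves end equal to the start of the string. That is why the lexer calls it only after iswdigit has confirmed a digit. wcstod also understands exponents and hexadecimal notation, so its idea of a number is wider than plain digits and a decimal point.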



The test passes. Next on the list: spaces between tokens must be skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





One more (unconditional → if) transformation: a character is treated as an operator only if it really is one, and anything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }
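The shape of this loop (number, operator, or skip) can be run without the Token class. The sketch below makes a simplifying assumption that the article does not: tokens are kept as substrings of the input. SplitTokens is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical simplification of the lexer loop: every token is stored
// as the piece of text it was read from.
inline std::vector<std::wstring> SplitTokens(const std::wstring &expr) {
    std::vector<std::wstring> result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(static_cast<std::wint_t>(*current))) {
            // wcstod is used only to find where the number ends.
            wchar_t *end = nullptr;
            std::wcstod(current, &end);
            result.push_back(std::wstring(current, static_cast<std::size_t>(end - current)));
            current = end;
        }
        else if(*current == L'+') {
            result.push_back(std::wstring(1, *current));
            ++current;
        }
        else {
            // Spaces, and any other unknown character, are silently skipped.
            ++current;
        }
    }
    return result;
}
```

Note that the last branch skips every unrecognized character, not only whitespace. The article's lexer behaves the same way at this step.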





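All tests pass. The function has grown, so it is time to refactor. The scanning logic moves into a Tokenizer class in a nested Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }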





Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







The last test on the list is an expression that uses all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }





First, the Operator enumeration gets the missing values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





The test still fails, because IsOperator in Tokenizer recognizes only the plus sign. It has to check the current character against every operator.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
    }
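The same check can be written as a free function and tried outside the class. IsOperatorSymbol is a hypothetical name used only for this sketch; the logic follows IsOperator above, with the character passed in as a parameter instead of being read from m_current.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function counterpart of Tokenizer::IsOperator.
inline bool IsOperatorSymbol(wchar_t symbol) {
    // auto deduces std::initializer_list<Operator> from the braced list.
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                 Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [symbol](Operator o) { return symbol == static_cast<wchar_t>(o); });
}
```

The list of operators has to be kept in sync with the enumeration by hand. An operator added to the enum but not to the list would be skipped by the lexer like a space.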





All tests pass. The complete code at this point:

Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    // Each operator is stored as the character that denotes it.
    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType {
        Operator,
        Number
    };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    // A token is either an operator or a number; m_type tells which.
    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        // Only the member selected by m_type is valid.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







The code for this article is available on GitHub.








 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





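The test uses AssertRange, a small helper of our own. Its AreEqual function compares an expected list with an actual range, first by size and then element by element, and reports the position of the first mismatch.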



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange
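The same element-by-element comparison can be tried outside the Visual Studio test framework. The following is a minimal sketch under that assumption: `FirstMismatch` is a hypothetical name that does not appear in the article, and it returns a position instead of calling `Assert`.

```cpp
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of the article's code): returns the index of
// the first position where `expect` and `actual` differ, or -1 when the two
// ranges are equal. A size difference counts as a mismatch at the length of
// the shorter range.
template <class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter)) return position;
    }
    // One range ended before the other, so the sizes differ.
    if (expectIter != std::end(expect) || actualIter != std::end(actual)) return position;
    return -1;
}
```

Taking the expected values as an `initializer_list` is what allows the braced list to be written directly in the call, just as the tests do with `AssertRange::AreEqual({ ... }, tokens)`.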







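The test also needs the Operator type. It is an enumeration whose underlying type is wchar_t, so each value is simply the character of the operator it represents.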



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;
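A short sketch of what the wchar_t underlying type buys: converting between a character and an operator is a plain cast in both directions, with no lookup table. The helper name `ToOperator` is hypothetical and used here only for illustration.

```cpp
#include <string>

enum class Operator : wchar_t { Plus = L'+' };

// Hypothetical helper: a character becomes an operator with a single cast.
inline Operator ToOperator(wchar_t symbol) { return static_cast<Operator>(symbol); }

// The reverse cast yields the printable symbol of the operator.
inline std::wstring ToString(Operator op) {
    return std::wstring(1, static_cast<wchar_t>(op));
}
```

The cast does not validate its input: any character can be converted, including one that has no enumerator, so the caller has to check the character first.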





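To compare values of a custom type, Assert requires a ToString function for that type, which it uses to print the values when the comparison fails.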



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







The test now compiles and fails, as it should. To make it pass, apply the (unconditional → if) transformation.



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





Assert again needs a ToString function, this time for TokenType. With it in place, write the next test.



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





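The test fails. Apply the (constant → scalar) transformation: the constant returned by Type() becomes a field set in the constructor.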



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





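A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the token type first and throw on a mismatch.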



Token
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // The conversions throw if the token holds the other kind of value.
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal, since the function returns std::wstring
    }
}
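For comparison, the same "either an operator or a number" idea can be written with std::variant where a C++17 compiler is available. This is only a sketch of an alternative, not the article's implementation: the class name `VariantToken` and the accessors `AsOperator` and `AsNumber` are hypothetical.

```cpp
#include <variant>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Hypothetical C++17 alternative to the hand-written tagged union.
class VariantToken {
public:
    VariantToken(Operator op) : m_value(op) {}
    VariantToken(double num) : m_value(num) {}

    // The variant remembers which alternative it holds, so no separate tag is stored.
    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    // std::get throws std::bad_variant_access when the other alternative is stored.
    Operator AsOperator() const { return std::get<Operator>(m_value); }
    double AsNumber() const { return std::get<double>(m_value); }

private:
    std::variant<Operator, double> m_value;
};
```

The trade-off is that the type check and the exception come from the standard library instead of explicit code, at the cost of requiring a newer language standard than the toolchain used in the article.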







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





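Parsing the number by hand is unnecessary, because the C library already does it: isdigit detects a digit and atof converts text to a number. Since the expression consists of wchar_t characters, their wide counterparts iswdigit and _wtof are used. This is the (expression → function) transformation.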



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





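The function now has to produce more than one token, so returning from the first branch is no longer enough. As a preparatory refactoring, introduce a result variable that collects the tokens and is returned at the end.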



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





The next transformation is (if → while): the single check becomes a loop that runs until the end of the string.



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





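The wcstod function replaces _wtof because it also reports where the number ends. The scan continues from that position, so the characters of the number are not read twice.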
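To see how wcstod reports where a number ends, here is a small stand-alone sketch. The function name `ScanNumbers` is hypothetical; it keeps only the numeric part of the loop and skips every other character.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in `text`.
std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while (*current) {
        if (std::iswdigit(static_cast<wint_t>(*current))) {
            wchar_t *end = nullptr;
            // wcstod stores in `end` the address of the first character
            // that is not part of the converted number.
            numbers.push_back(std::wcstod(current, &end));
            current = end;  // continue right after the number
        } else {
            ++current;      // not a digit: skip one character
        }
    }
    return numbers;
}
```

Because the conversion starts only at a digit, wcstod always consumes at least one character here, so the loop cannot stall.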



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





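Apply (unconditional → if) once more: a character is added as an operator only if it really is one, and anything else is skipped.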



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





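All tests pass, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function only delegates to it.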



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





The Operator enumeration has to be extended with the remaining operators and the parentheses.



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





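The test still fails, because only the plus sign is recognized. The only method that has to change is IsOperator in the Tokenizer class.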



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }
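The same membership check can also be written as a lookup in a string of operator symbols. This is a sketch of an alternative, not the article's code: `IsOperatorSymbol` is a hypothetical free function, and the symbol string repeats the enumerator values by hand, so it has to be kept in sync with the Operator enumeration.

```cpp
#include <cwchar>

// Hypothetical variant of the operator check.
inline bool IsOperatorSymbol(wchar_t symbol) {
    // wcschr would also match the terminating L'\0', so it is excluded explicitly.
    return symbol != L'\0' && std::wcschr(L"+-*/()", symbol) != nullptr;
}
```

The article's version with std::any_of lists the enumerators themselves, which avoids that duplication of the symbols.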





. .



Interpreter.h
#pragma once
#include <algorithm>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>   // wcstod
#include <wctype.h>  // iswdigit

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) { ScanNumber(); }
            else if(IsOperator()) { ScanOperator(); }
            else { MoveNext(); } // skip spaces and unknown characters
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end; // continue right after the number
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter












. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




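The token list is a std::vector of Token, aliased as Tokens.

A stub is enough for the test to compile. It throws, so the first run fails (red).

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: compiles, but always fails at run time.
inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter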





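Robert C. Martin's Transformation Priority Premise (TPP) is a useful guide for the green step. It lists the transformations that make code more general and ranks them from simple to complex. When several changes would make a failing test pass, the one higher in the list is preferred.

The transformations, in priority order:

  1. ({} → nil) No code at all becomes code that returns nil.
  2. (nil → constant) Nil becomes a constant.
  3. (constant → constant+) A simple constant becomes a more complex one.
  4. (constant → scalar) A constant becomes a variable or an argument.
  5. (statement → statements) More unconditional statements are added (break, continue, return and the like).
  6. (unconditional → if) The execution path is split.
  7. (scalar → array) A scalar becomes an array.
  8. (array → container) An array becomes a container.
  9. (statement → recursion) A statement becomes a recursive call.
  10. (if → while) A condition becomes a loop.
  11. (expression → function) An expression becomes a function or an algorithm.
  12. (variable → assignment) The value of a variable is replaced.

Returning an empty list is enough to make the first test pass.

inline Tokens Tokenize(std::string expr) {
    return {};
}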





Before the next test, the parameter type changes from std::string to std::wstring so that the lexer can accept Unicode input.

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}





AssertRange is a small test helper. Its AreEqual compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare the sizes first.
    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");

    // Then compare element by element.
    for (; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
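For readers without the Visual Studio test framework, the comparison that AssertRange performs can be sketched in standard C++. This is only an illustration: FirstMismatch is a hypothetical name that does not appear in the article, and it returns a position instead of raising an assertion.

```cpp
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of the article's code): returns the index of
// the first position at which the two ranges differ, or -1 if they are equal.
// A length difference counts as a mismatch at the end of the shorter range.
template <class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual)
{
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
         ++expectIter, ++actualIter, ++index)
    {
        if (!(*expectIter == *actualIter))
            return index;
    }
    // One range ran out before the other: report where the shorter one ended.
    if (expectIter != std::end(expect) || actualIter != std::end(actual))
        return index;
    return -1;
}
```

Calling FirstMismatch({ 1, 5, 3 }, v) for a vector v holding 1, 2, 3 reports position 1. A test helper would turn any result other than -1 into a failure message.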







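Operator is an enum class whose underlying type is wchar_t, so the value of each enumerator is the character that denotes it. For now Token is just an alias for Operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;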
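Because the enumerator value is the character itself, converting in either direction is a plain cast. A minimal sketch of that round trip follows; ToOperator is a hypothetical helper name introduced only for this example.

```cpp
#include <string>

// Same shape as the article's enum at this step: the underlying type is
// wchar_t and the enumerator's value is the character itself.
enum class Operator : wchar_t { Plus = L'+' };

// Character -> Operator: a plain cast, no lookup table needed.
// (Hypothetical helper name; the article casts inline.)
inline Operator ToOperator(wchar_t symbol)
{
    return static_cast<Operator>(symbol);
}

// Operator -> text: cast back and build a one-character string.
inline std::wstring ToString(Operator op)
{
    return std::wstring(1, static_cast<wchar_t>(op));
}
```

The cast does not validate its input: any wchar_t value converts to some Operator value, so the caller has to check that the character really is an operator.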





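For Assert to describe a failed comparison, a ToString function must exist for the compared type, so one is added for Token.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return { static_cast<wchar_t>(token) };
}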







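The empty-expression test has to keep passing while the new one is satisfied, which calls for the (unconditional → if) transformation.

inline Tokens Tokenize(std::wstring expr) {
    if (expr.empty()) {
        return {};
    }
    return { static_cast<Operator>(expr[0]) };
}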





The next test covers a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





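A token can now be either an operator or a number, so Token can no longer be an alias for Operator. The options weighed at this point involve a Token class hierarchy queried with dynamic_cast, std::function, and Boost.Any. The design chosen here is simpler: a single Token class that knows its own type.

Token gets its own tests, starting with the type of an operator token.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};

…

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};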





Assert needs a ToString function for TokenType as well. The next test asks for the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}





The (constant → scalar) transformation replaces the hard-coded return value with a member variable.

class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}

    TokenType Type() const { return m_type; }

private:
    TokenType m_type;
};





Next, an operator token should give back the operator it was constructed with.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





The operator is stored in the token and exposed through a conversion operator.

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same test for a number token:

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two fields can share storage in a union. Each conversion checks the type first and throws if it does not match.

Token

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch (token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            // Wide literal: the function returns std::wstring.
            return L"Unknown token.";
    }
}
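The effect of the type checks can be seen in a reduced, standalone copy of the class. This sketch assumes only standard C++. RejectsNumberRead is a hypothetical helper added for the example and is not part of the article's code.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Reduced copy of the Token class above, enough to exercise the checks.
class Token
{
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const
    {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const
    {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union // only the member selected by m_type is meaningful
    {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper for the example: true if reading the token as a number
// is rejected with std::logic_error.
inline bool RejectsNumberRead(const Token &token)
{
    try
    {
        static_cast<void>(static_cast<double>(token));
        return false;
    }
    catch (const std::logic_error &)
    {
        return true;
    }
}
```

Reading a token through the wrong conversion fails with an exception instead of returning whatever bits the union happens to hold. That check is what makes the union safe to use here.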







With Token in place, the lexer can return a number for a single digit:

if (expr[0] >= L'0' && expr[0] <= L'9') {
    return { static_cast<double>(expr[0] - L'0') };
}
return { static_cast<Operator>(expr[0]) };





A number can have more than one digit and a fractional part.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





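The C library already has functions for this: isdigit and atof, or rather their wchar_t counterparts iswdigit and _wtof (the latter is specific to the Microsoft C runtime). Replacing the hand-written arithmetic with a library call is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if (!*current)
        return {};
    if (iswdigit(*current))
        return { _wtof(current) };
    return { static_cast<Operator>(*current) };
}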





The next test combines an operator and a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





The lexer now has to return more than one token. As a preparatory step, a result variable is introduced and the tokens are pushed into it.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if (!*current)
        return result;
    if (iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}





The next transformation is (if → while): the single check becomes a loop over the expression.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while (*current) {
        if (iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





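wcstod replaces _wtof because, besides converting the number, it reports through its second argument where the number ended, so scanning resumes from exactly that position.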
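The role of the end pointer is easier to see in isolation. The sketch below uses only the standard library and the default "C" locale. ScanNumbers is a hypothetical function written for this example, and it ignores everything that is not a number.

```cpp
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in `text`, skipping any
// character that is not a decimal digit.
inline std::vector<double> ScanNumbers(const std::wstring &text)
{
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while (*current != L'\0')
    {
        if (*current >= L'0' && *current <= L'9')
        {
            wchar_t *end = nullptr;
            // wcstod converts the longest valid prefix and stores a pointer
            // to the first unparsed character in `end`.
            numbers.push_back(std::wcstod(current, &end));
            current = end; // resume right after the number
        }
        else
        {
            ++current; // not a number: skip one character
        }
    }
    return numbers;
}
```

For the input L"12.34+5" the first call consumes "12.34" and leaves the pointer on '+'. Because a number is only scanned when the current character is a digit, wcstod always consumes at least one character and the loop cannot stall.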



Next, spaces have to be ignored.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





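The (unconditional → if) transformation again: only a plus sign is treated as an operator, and any other character is skipped.

while (*current) {
    if (iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if (*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current; // neither a number nor an operator: skip it
    }
}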





Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while (!EndOfExpression()) {
            if (IsNumber()) {
                ScanNumber();
            } else if (IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







With this structure in place, the remaining operators can be added in one step. The test covers an expression that uses all of them.

TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





Operator gets the remaining operators and the parentheses.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





IsOperator in Tokenizer now checks the current character against all of them.

bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
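The same membership check can be written in other ways. One possible alternative, shown here only as a sketch and not as the article's approach, keeps the operator characters in a single string and searches it. IsOperatorChar is a hypothetical free function.

```cpp
#include <string>

// Hypothetical alternative to the any_of version: keep the operator
// characters in one string and search it.
inline bool IsOperatorChar(wchar_t symbol)
{
    static const std::wstring operators = L"+-*/()";
    return operators.find(symbol) != std::wstring::npos;
}
```

The trade-off is that the character list now duplicates the enumerator values, so the two have to be kept in sync by hand, while the any_of version derives the characters from the enum itself.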





The complete listings follow.

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch (type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if (m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if (m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if (left.m_type == right.m_type) {
            switch (left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch (token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while (!EndOfExpression()) {
            if (IsNumber()) {
                ScanNumber();
            } else if (IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");

    for (; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests







The code for this article is available on GitHub.




 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (ifwhile). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
                Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







The code for this article is available on GitHub.




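The lexer returns its result as a std::vector of tokens, which we name Tokens.

Let's write the minimum needed for the test to compile. For now Tokenize simply throws, so the test fails.

    #pragma once
    #include <exception>
    #include <string>
    #include <vector>

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer
    } // namespace Interpreter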





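Now the test has to pass, and with the smallest possible change. A useful guide here is The Transformation Priority Premise (TPP), formulated by Robert C. Martin. A refactoring changes the structure of code without changing its behavior. A transformation is the opposite kind of change: it alters behavior, moving the code from a specific solution towards a more general one. The premise is that transformations can be ordered from simple to complex, and that when a failing test can be satisfied in several ways, the transformation closer to the top of the list is the better choice.

The transformations, in priority order:

  1. ({} → nil): from no code at all to code that uses nil.
  2. (nil → constant): replace nil with a constant.
  3. (constant → constant+): replace a simple constant with a more complex one.
  4. (constant → scalar): replace a constant with a variable or an argument.
  5. (statement → statements): add more unconditional statements (break, continue, return and the like).
  6. (unconditional → if): split the execution path.
  7. (scalar → array): replace a variable or argument with an array.
  8. (array → container): replace an array with a more complex container.
  9. (statement → recursion): replace a statement with recursion.
  10. (if → while): replace a condition with a loop.
  11. (expression → function): replace an expression with a function or an algorithm.
  12. (variable → assignment): replace the value of a variable.

For the first test a transformation from the very top of the list is enough: return an empty token list.

    inline Tokens Tokenize(std::string expr) { return{}; }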
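To get a feel for how the transformations stack up, here is a minimal sketch on a made-up function that has nothing to do with the interpreter. The name CountLeadingDigits and its three versions are assumptions for illustration only; each version is the least that the test named in its comment demands.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical example, not part of the interpreter.

// Test 1: an empty string has no leading digits.
// (nil -> constant): a constant is all that the test asks for.
inline std::size_t CountLeadingDigitsV1(const std::wstring &)
{
    return 0;
}

// Test 2: L"7" has one leading digit.
// (unconditional -> if): the execution path splits.
inline std::size_t CountLeadingDigitsV2(const std::wstring &text)
{
    if (!text.empty() && std::iswdigit(text[0]))
        return 1;
    return 0;
}

// Test 3: L"123+4" has three leading digits.
// (if -> while): the condition becomes a loop.
inline std::size_t CountLeadingDigitsV3(const std::wstring &text)
{
    std::size_t count = 0;
    while (count < text.size() && std::iswdigit(text[count]))
        ++count;
    return count;
}
```

Each step is driven by one new failing test, and none of them reaches further down the list than that test requires.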





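The test passes, and there is nothing to refactor yet.

Before moving on, the parameter type changes from std::string to std::wstring, so that expressions can contain Unicode characters. The next test is a single operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }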





AssertRange is a helper of our own. Assert::AreEqual compares one pair of values at a time, so AssertRange::AreEqual takes an expected initializer list and an actual range, checks that their sizes match, and then compares them element by element, reporting the position of a mismatch.

AssertRange

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
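The same comparison can be tried outside Visual Studio. The sketch below is an assumption-laden stand-in, not the article's helper: DescribeMismatch is a hypothetical name, and instead of calling Assert it returns a description of the first difference, or an empty string when the ranges match.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <string>

// Hypothetical stand-in for AssertRange::AreEqual that reports instead of asserting:
// returns an empty string when the ranges match, otherwise a description
// of the first difference found.
template <class T, class ActualRange>
std::wstring DescribeMismatch(std::initializer_list<T> expect, const ActualRange &actual)
{
    // Compare sizes first, so that a shorter or longer range is reported as such.
    if (std::distance(std::begin(expect), std::end(expect)) !=
        std::distance(std::begin(actual), std::end(actual)))
        return L"Size differs.";

    auto actualIter = std::begin(actual);
    std::size_t position = 0;
    for (const T &expected : expect)
    {
        assert(actualIter != std::end(actual)); // guaranteed by the size check
        if (!(expected == static_cast<T>(*actualIter)))
            return L"Mismatch in position " + std::to_wstring(position);
        ++actualIter;
        ++position;
    }
    return std::wstring();
}
```

As in the original helper, T is deduced from the expected list, and each actual element is converted to T before the comparison.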







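An operator is described by the Operator enumeration. Its underlying type is wchar_t, so the value of each enumerator is the character of the operator it stands for. While there is only one kind of token, Token is simply an alias for Operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;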
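A short sketch of what the wchar_t underlying type buys: converting between a character and an Operator is a plain static_cast in both directions. The helper names ToChar and FromChar are hypothetical and exist only for this illustration.

```cpp
// Reduced copy of the enumeration above.
enum class Operator : wchar_t { Plus = L'+' };

// Hypothetical helpers: both directions are a plain static_cast.
constexpr wchar_t ToChar(Operator op) { return static_cast<wchar_t>(op); }
constexpr Operator FromChar(wchar_t symbol) { return static_cast<Operator>(symbol); }
```

The cast accepts any character: FromChar(L'x') compiles and yields a value that matches no enumerator, so a character has to be checked before it is turned into an Operator.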





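For Assert to describe the compared values in a failure message, a ToString function must exist for their type, so we add one.

std::wstring ToString(const Token &)

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }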







Now the test compiles and fails, as it should. To make it pass we apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }





The next test is a single digit.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





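This test does not compile: Token can only be an operator, and now it has to hold either an operator or a number. The designs that come to mind are a class hierarchy with Token as the base (reading a value back then takes a dynamic_cast), a design built around std::function, and a type-erased value such as Boost.Any.

Here we take the simplest route: a single Token class that knows its own type. It gets its own test class, and the first test asks an operator token for its type.

    enum class TokenType {
        Operator,
        Number
    };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };
    …
    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };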





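A ToString overload for TokenType is needed too, for the same reason as before. Next, a number token should report the Number type.

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }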





The test fails. We apply (constant → scalar): the hard-coded return value becomes a member that the constructors set.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }
    private:
        TokenType m_type;
    };





Next, an operator token should give its operator back.

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





We store the operator and add a conversion operator.

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };

The same goes for a number token.

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the token type and throw if the wrong one is requested.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                return L"Unknown token."; // wide literal: the function returns std::wstring
        }
    }







Token now covers both cases, so we can go back to Should_tokenize_single_digit. The simplest change that makes it pass:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };





This works, but only for a single digit. The next test is a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





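The C standard library already has what we need: isdigit tells whether a character is a digit, and atof converts text to a number. Since we work with wchar_t, we take their wide-character counterparts, iswdigit and _wtof (the latter is Microsoft-specific). This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }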





The test passes. Next, an operator followed by a number.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





So far Tokenize returns at most one token. As a preparatory step that does not change behavior, we introduce a result variable that tokens are added to.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }





Now the transformation itself: (if → while). Each branch also has to advance the current position.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





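wcstod takes the place of _wtof because, besides the value, it reports through its second argument a pointer to the first character after the number, and that is exactly where scanning has to continue.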
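The role of the end pointer is easier to see in isolation. The following is a minimal sketch, assuming a hypothetical ScanNumbers function that collects only the numbers from a string and skips everything else; it is not part of the article's lexer.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in the text
// and skips all other characters.
inline std::vector<double> ScanNumbers(const std::wstring &text)
{
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while (*current != L'\0')
    {
        if (std::iswdigit(*current))
        {
            wchar_t *end = nullptr;
            // wcstod parses as much as it can and sets end
            // to the first character it did not consume.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        }
        else
        {
            ++current;
        }
    }
    return numbers;
}
```

Because the call is made only when the current character is a digit, wcstod always consumes at least one character and the loop is guaranteed to advance. Note that wcstod also accepts exponent notation, so an input such as 1e3 is read as a single number.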



Next, whitespace should be skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





We apply (unconditional → if): only a plus is treated as an operator, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }





Tokenize has grown, so it is time to refactor. The logic moves into a Tokenizer class in the Detail namespace, split into small methods with descriptive names. Tokenize itself becomes a thin wrapper.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







One test is left: an expression with all the operators and parentheses.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }





The Operator enumeration gets the remaining values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





The IsOperator method of Tokenizer has to recognize them as well.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); });
    }
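The same membership test can also be written as a lookup in a string of operator characters. This is only an alternative sketch, not the article's code: IsOperatorChar is a hypothetical free function, and the character set is copied by hand from the Operator enumeration, so the two would have to be kept in sync.

```cpp
#include <cwchar>

// Hypothetical alternative to Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t symbol)
{
    // wcschr treats the terminating L'\0' as part of the string,
    // so the end-of-expression character has to be excluded by hand.
    return symbol != L'\0' && std::wcschr(L"+-*/()", symbol) != nullptr;
}
```

The version with std::any_of has the advantage of naming the enumerators explicitly, at the cost of a slightly longer body.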





All the tests pass.



std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







To pass this test while keeping the previous one green, we apply (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





The next test is a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





The lexer now has to return numbers as well as operators, so Token can no longer be an alias for Operator. The alternatives mentioned for this are dynamic_cast, std::function and Boost.Any. Here Token becomes a class that reports which kind of value it holds, and we start with a test for Token itself.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};





Assert needs a ToString for TokenType as well. With that in place, the next test covers a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}





We apply (constant → scalar): the type becomes a field that each constructor sets.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};





The next test reads the operator back from a token.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





Token stores the operator and gets a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same for a number.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two values share storage in a union. Each conversion checks the token type first and throws std::logic_error on a mismatch.

Token
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        // The function returns std::wstring, so the literal must be wide.
        default: return L"Unknown token.";
    }
}
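The checked conversions can be tried in isolation. The sketch below repeats a reduced Token (only the Plus operator) so that it compiles on its own. ThrowsWhenReadAsNumber is a hypothetical helper added for illustration; it is not part of the article's code.

```cpp
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Reduced copy of the Token class: a type tag plus a union of the two payloads.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true when reading the token as a number is rejected.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

Because the type tag is checked before the union is read, asking an operator token for a number fails loudly instead of reinterpreting the stored bytes.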







With Token in place we return to the lexer and the single-digit test. The simplest implementation:

if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };





The next test is a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





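Rather than parse the number by hand, we use the C library: isdigit and atof, or rather their wchar_t counterparts iswdigit and _wtof (the latter is Microsoft-specific). This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}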





Next, an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





First a refactoring that does not change behavior: the tokens are collected in a local variable result, which is returned at the end.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}





Then we apply (if → while), so that the whole string is consumed.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





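wcstod replaces _wtof here because, besides converting the number, it reports through its second argument where the number ends. The loop continues from that position.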
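How wcstod moves the scanning position can be seen in a minimal sketch. ScanNumber here is a hypothetical free function written for illustration; the article's own code keeps the position in a local variable instead.

```cpp
#include <cwchar>

// Hypothetical helper: reads one number starting at cursor and
// advances cursor to the first character after that number.
inline double ScanNumber(const wchar_t *&cursor) {
    wchar_t *end = nullptr;
    double value = std::wcstod(cursor, &end);  // end receives the stop position
    cursor = end;
    return value;
}
```

For the input L"12.34+5" the call returns 12.34 and leaves the cursor on the plus sign, which is what lets the loop go on to scan the operator.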



Next, spaces should be skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





We apply (unconditional → if): a character is scanned as an operator only if it is one, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}





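Time to refactor. The scanning logic moves into a Tokenizer class in a nested Detail namespace, and Tokenize only delegates to it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}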





Detail::Tokenizer
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







Finally, a test with a complete expression that contains all the operators and parentheses.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





Operator gets the remaining values.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





IsOperator in Tokenizer now has to recognize all of them.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}





The complete code at this point:

Interpreter.h
#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp
#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
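To try the same scanning loop outside Visual Studio, a reduced sketch using only the standard library may help. It is deliberately simplified: lexemes are kept as plain text rather than as Token objects, and both IsOperatorChar and SplitLexemes are hypothetical names that do not appear in the article's code.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: true for the six operator characters used by the lexer.
inline bool IsOperatorChar(wchar_t c) {
    const std::wstring all = L"+-*/()";
    return all.find(c) != std::wstring::npos;
}

// Hypothetical simplified lexer: splits an expression into number and
// operator lexemes and skips every other character (for example, spaces).
inline std::vector<std::wstring> SplitLexemes(const std::wstring &expr) {
    std::vector<std::wstring> result;
    const wchar_t *current = expr.c_str();
    while(*current != L'\0') {
        if(std::iswdigit(static_cast<std::wint_t>(*current))) {
            wchar_t *end = nullptr;
            std::wcstod(current, &end);          // used only to find where the number ends
            const wchar_t *stop = end;
            result.push_back(std::wstring(current, stop));
            current = stop;
        } else if(IsOperatorChar(*current)) {
            result.push_back(std::wstring(1, *current));
            ++current;
        } else {
            ++current;
        }
    }
    return result;
}
```

With this sketch, SplitLexemes(L" 1 + 12.34 ") yields the three lexemes 1, + and 12.34, mirroring the Should_skip_spaces test.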














GitHub . , . "__".



. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







The next test tokenizes a complete expression that contains every operator and both parentheses.



TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
        Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
        Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}





The Operator enumeration gets the missing values, one for each new character.



enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





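The test still fails, because the IsOperator method of the Tokenizer class recognizes only the plus sign. It has to accept every operator.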



bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
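As a design note, the same check can be written as a lookup in a string of operator characters. This is a possible alternative and not the article's approach; IsOperatorChar and IsOperatorCharExample are hypothetical names. The drawback is that the character list duplicates the values of the Operator enumeration.

```cpp
#include <cassert>
#include <string>

// Hypothetical alternative to the any_of version: every operator character
// is listed once in a string. std::wstring::find does not match the
// terminating L'\0', so the end of the input is never taken for an operator.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}

inline void IsOperatorCharExample() {
    assert(IsOperatorChar(L'*'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L'\0'));
}
```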





The complete code at this point:



Interpreter.h

#pragma once
#include <algorithm>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    // The conversions throw when the token holds the other kind of value.
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}







The code for the article is available on GitHub.



. , .




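Tokens is defined as a synonym for std::vector<Token>, and Tokenize gets a stub that only throws an exception. Everything compiles, and the test fails as it should.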



#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub: throwing guarantees that the first test fails (red).
inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter





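Now the test has to be made to pass with the smallest possible change. A good guide here is The Transformation Priority Premise (TPP) by Robert Martin. Transformations are the counterpart of refactorings: a refactoring changes the structure of the code without changing its behavior, while a transformation changes the behavior, moving the code from the specific to the more general. The premise is that transformations have a priority order, and that preferring the simpler ones leads to simpler code.

The transformations, from the simplest to the most complex: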



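  1. ({} → nil) From no code at all to code that uses nil.
  2. (nil → constant) From nil to a constant.
  3. (constant → constant+) From a simple constant to a more complex one.
  4. (constant → scalar) From a constant to a variable or an argument.
  5. (statement → statements) Adding more unconditional statements (break, continue, return and the like).
  6. (unconditional → if) Splitting the execution path.
  7. (scalar → array) From a variable to an array.
  8. (array → container) From an array to a more complex container.
  9. (statement → recursion) Replacing a statement with recursion.
  10. (if → while) Replacing a condition with a loop.
  11. (expression → function) Replacing an expression with a function or an algorithm.
  12. (variable → assignment) Replacing the value of a variable.

Here the very first transformation is enough.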



inline Tokens Tokenize(std::string expr) { return{}; }
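To make the transformation list more concrete, here is a sketch that applies three of the transformations to a separate toy task, counting the digits in a string. The task and the CountDigitsV1 to CountDigitsV3 names are assumptions made for illustration only; each version is what a test-driven session might produce after one more test.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// ({} → nil): the first test ("" gives 0) needs nothing but an empty answer.
inline std::size_t CountDigitsV1(const std::wstring &) {
    return 0;
}

// (unconditional → if): a second test ("7" gives 1) splits the execution path.
// Deliberately naive: any non-empty input counts as one digit at this stage.
inline std::size_t CountDigitsV2(const std::wstring &text) {
    if(text.empty()) return 0;
    return 1;
}

// (if → while): a third test ("a1b22" gives 3) turns the condition into a loop.
inline std::size_t CountDigitsV3(const std::wstring &text) {
    std::size_t count = 0;
    for(wchar_t c : text) {
        if(std::iswdigit(c)) ++count;
    }
    return count;
}

inline void CountDigitsExample() {
    assert(CountDigitsV1(L"") == 0);
    assert(CountDigitsV2(L"7") == 1);
    assert(CountDigitsV3(L"a1b22") == 3);
}
```

Each version passes all the tests written up to that point and nothing more, which is the same rhythm the lexer follows in this article.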





, . .



. . . . . .

In the next test std::string gives way to std::wstring, so that expressions are handled as Unicode text.



TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}





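AssertRange is a small helper written for these tests. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.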



AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
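For readers who do not use the Visual Studio test framework, the same comparison can be sketched without it. This is an assumption-laden counterpart and not the article's code: FirstMismatch and FirstMismatchExample are hypothetical names, and the function reports the position instead of failing a test.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical framework-free counterpart of AssertRange::AreEqual.
// Returns -1 when the ranges match. Otherwise returns the index of the
// first differing element, or the length of the shorter range when one
// range is a prefix of the other.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = expect.begin();
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;
    for(; expectIter != expect.end() && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    const bool sameSize = expectIter == expect.end() && actualIter == std::end(actual);
    return sameSize ? -1 : index;
}

inline void FirstMismatchExample() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1); // equal
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);  // element 1 differs
    assert(FirstMismatch({ 1, 2 }, actual) == 2);     // sizes differ
}
```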







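Operator is defined as an enumeration whose underlying type is wchar_t, so that each value is simply the character of the operator. For now, Token is just a synonym for Operator.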



enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;
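Because the underlying type is wchar_t, converting between a character and an Operator is a plain static_cast in both directions. The sketch below illustrates this; FromChar, ToChar and OperatorCastExample are hypothetical helper names, not part of the article's code.

```cpp
#include <cassert>

enum class Operator : wchar_t {
    Plus = L'+',
};

// Hypothetical helpers that make the two casts explicit.
inline Operator FromChar(wchar_t c) { return static_cast<Operator>(c); }
inline wchar_t ToChar(Operator op) { return static_cast<wchar_t>(op); }

inline void OperatorCastExample() {
    assert(FromChar(L'+') == Operator::Plus);
    assert(ToChar(Operator::Plus) == L'+');
    // The cast itself does no validation: L'?' produces an Operator value
    // that matches no named enumerator.
    assert(FromChar(L'?') != Operator::Plus);
}
```

Since the cast accepts any character, a lexer built this way has to check the character before casting if it wants to reject unknown operators.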





To report a failed comparison, Assert needs a ToString function for the compared type, so one is added for Token.



std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}







The test compiles and fails. To make it pass without breaking the previous one, apply (unconditional → if).



inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





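This test does not even compile: a token now has to hold either an operator or a number. There are several ways to arrange that:

  1. Build a class hierarchy with Token as the base class and reach the concrete type through dynamic_cast.
  2. Hide the differences behind callable objects such as std::function.
  3. Store the value in a generic holder such as Boost.Any.

Here a simpler route is taken: Token becomes a small class that remembers which kind of value it holds. It is developed with tests of its own, starting with the token type.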
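For comparison, the dynamic_cast option could look roughly like the sketch below. It is not the design this article goes on to use, and every name in it (TokenBase, NumberToken, OperatorToken, SumNumbers, SumNumbersExample) is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical class hierarchy: one derived class per kind of token.
struct TokenBase {
    virtual ~TokenBase() = default;
};

struct NumberToken : TokenBase {
    explicit NumberToken(double v) : value(v) {}
    double value;
};

struct OperatorToken : TokenBase {
    explicit OperatorToken(wchar_t c) : code(c) {}
    wchar_t code;
};

// Client code has to ask every token what it really is.
inline double SumNumbers(const std::vector<std::unique_ptr<TokenBase>> &tokens) {
    double sum = 0;
    for(const auto &token : tokens) {
        if(auto *number = dynamic_cast<const NumberToken *>(token.get())) {
            sum += number->value;
        }
    }
    return sum;
}

inline void SumNumbersExample() {
    std::vector<std::unique_ptr<TokenBase>> tokens;
    tokens.push_back(std::unique_ptr<TokenBase>(new NumberToken(1.5)));
    tokens.push_back(std::unique_ptr<TokenBase>(new OperatorToken(L'+')));
    tokens.push_back(std::unique_ptr<TokenBase>(new NumberToken(2.5)));
    assert(SumNumbers(tokens) == 4.0);
}
```

The cost of this variant is a heap allocation per token and a run-time type check at every use, which is a lot of machinery for two kinds of value.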



enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};

…

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};





A ToString function for TokenType is needed as well, for the same reason as before. The next test covers number tokens.



TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}





The test fails. Apply (constant → scalar): the hard-coded token type becomes a field that each constructor sets.



class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }

private:
    TokenType m_type;
};





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





.



class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





.



TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





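Since a token holds either an operator or a number, never both, the two fields can share storage in a union. The conversion operators check the stored type before reading it.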



Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal, to match std::wstring
    }
}
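The checked conversions can be seen in action with a short sketch. The Token class is repeated here in condensed form only so that the example compiles on its own; ThrowsWhenReadAsNumber and TokenConversionExample are hypothetical helpers that do not appear in the article.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Condensed copy of the Token class, repeated so the sketch is self-contained.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: true when reading `token` as a number is rejected.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        (void) static_cast<double>(token);
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

inline void TokenConversionExample() {
    assert(static_cast<double>(Token(1.23)) == 1.23);
    assert(!ThrowsWhenReadAsNumber(Token(1.23)));
    assert(ThrowsWhenReadAsNumber(Token(Operator::Plus)));
}
```

Reading the wrong member of the union would be undefined behavior, so the type check in each conversion operator is what keeps the union safe to use.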







, , , . . .



. . . .

Back in the lexer, the simplest change that makes the single-digit test pass:



if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





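The C standard library already has what is needed here: isdigit to recognize a digit and atof to convert a string to a number, both with counterparts for wchar_t. Using them is the (expression → function) transformation.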



inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}
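A portability note: _wtof comes from the Microsoft C runtime, while iswdigit and wcstod are standard. On other toolchains the same step can be sketched with the standard functions only. StartsWithNumber, ToNumber and ToNumberExample below are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>

// Hypothetical helpers built only on the standard library.
inline bool StartsWithNumber(const wchar_t *text) {
    return std::iswdigit(*text) != 0;
}

// Passing nullptr as the second argument makes wcstod behave like a plain
// string-to-double conversion.
inline double ToNumber(const wchar_t *text) {
    return std::wcstod(text, nullptr);
}

inline void ToNumberExample() {
    assert(StartsWithNumber(L"12.5"));
    assert(!StartsWithNumber(L"+12.5"));
    assert(ToNumber(L"12.5") == 12.5);
}
```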





. .



TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





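The function now has to return more than one token. As a first step, the tokens are collected in a result variable that is returned at the end.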



inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}













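The test does not compile yet. It needs a Token type, a Tokens list (a plain std::vector of tokens is enough for now), and a Lexer::Tokenize function. The first version of the function throws, so that the test compiles and fails.

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

// Stub for the red phase: it compiles, but any call fails the test.
inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter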





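Now the test has to pass, and the question is how much code to write for that. A useful guide here is The Transformation Priority Premise (TPP), proposed by Robert C. Martin. A refactoring changes the structure of code without changing its behavior; a transformation is its counterpart and changes the behavior, making the code slightly more general. The premise is that transformations can be ordered from simple to complex, and that a failing test should be passed with the simplest transformation that works. If a test can only be passed with a complex transformation, it is usually better to write a different test first.

The transformations, from the simplest to the most complex:

({} → nil) no code at all becomes code that uses nil.
(nil → constant) nil becomes a constant.
(constant → constant+) a simple constant becomes a more complex one.
(constant → scalar) a constant becomes a variable or an argument.
(statement → statements) more unconditional statements are added (break, continue, return and the like).
(unconditional → if) the execution path is split.
(scalar → array) a variable becomes an array.
(array → container) an array becomes a container.
(statement → recursion) a statement becomes a recursive call.
(if → while) a condition becomes a loop.
(expression → function) an expression becomes a function or an algorithm.
(variable → assignment) the value of a variable is replaced.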
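As an illustration of how the list is used, here is a minimal sketch, unrelated to the lexer itself, of one function growing through three transformations. The names CountDigitsV1, CountDigitsV2 and CountDigitsV3 are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Step 1, (nil -> constant): enough for the test "an empty string has no digits".
inline int CountDigitsV1(const std::wstring &) { return 0; }

// Step 2, (unconditional -> if): enough for strings of at most one character.
inline int CountDigitsV2(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}

// Step 3, (if -> while): the general case, written only when a test demands it.
inline int CountDigitsV3(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(std::iswdigit(text[i])) ++count;
        ++i;
    }
    return count;
}
```

Each version is the smallest change that satisfies one more test, which is the same rhythm the lexer follows in the rest of the article.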

To make the first test pass, it is enough for the function to return an empty list.

inline Tokens Tokenize(std::string expr) {
    return{};
}





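Before the next test, the parameter type changes from std::string to std::wstring, so that Unicode expressions can be supported later. The next test expects a single plus sign to produce one operator token.

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}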





AssertRange does not exist in the framework, so its AreEqual has to be written by hand. It compares the sizes of the two ranges first and then the elements one by one, reporting the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
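The same comparison can be tried outside Visual Studio. The sketch below is a framework-free variant that uses only the standard library; FirstMismatch is a hypothetical name, and instead of failing an assertion it returns the position of the first difference.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Returns the index of the first position where the ranges differ,
// or -1 when they have the same length and equal elements.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    // One range ended early: report the first index the shorter range lacks.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) return position;
    return -1;
}
```

As with the article's helper, the element type T is deduced from the braced list, so the list must not be empty.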







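Operator is declared next. Its underlying type is wchar_t, so the value of each enumerator is the character it is written with, and a character from the expression becomes an operator with a single cast. For now a token is nothing more than an operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;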
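A small sketch of what the wchar_t underlying type buys. ToChar and FromChar are hypothetical helpers added for this example; the article itself calls static_cast directly.

```cpp
#include <cassert>
#include <type_traits>

enum class Operator : wchar_t { Plus = L'+' };

// The underlying type is fixed by the declaration, so this holds at compile time.
static_assert(std::is_same<std::underlying_type<Operator>::type, wchar_t>::value,
              "Operator is stored as wchar_t");

// Enumerator -> character it is written with.
inline wchar_t ToChar(Operator op) { return static_cast<wchar_t>(op); }

// Character -> enumerator; no lookup table is needed.
inline Operator FromChar(wchar_t c) { return static_cast<Operator>(c); }
```

Note that FromChar accepts any character, including ones that name no operator, so the caller has to check the character first.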





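For the test to compile, Assert also needs a way to print the values it compares. The framework expects a ToString function returning std::wstring for the compared type, so one is added for Token.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}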







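The code compiles, but the test fails because Tokenize always returns an empty list. Applying (unconditional → if) is enough: an empty expression gives an empty list, and anything else gives its first character as an operator.

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}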





The next test is for a number made of a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





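This test does not compile: a token is currently just an operator and cannot hold a number. There are several ways to let one type carry either value. Candidates include a Token class hierarchy queried with dynamic_cast, a solution built on std::function, and Boost.Any, although the last one is ruled out because the project uses only the standard library.

A simpler route is taken here: a single Token class that remembers which kind of token it is. The lexer test is set aside for now, and Token gets tests of its own, starting with the type of an operator token.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};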





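Assert needs a ToString for TokenType as well; it is shown in the final listing. The next test checks the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}

It does not compile until Token gets a constructor taking double. After that (constant → scalar) applies: the constant returned by Type() is replaced with a member that each constructor sets.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};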





Next, an operator token has to give back the operator it was created with.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}

The operator is saved in the constructor and returned through a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};

The same is needed for the number.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number and never both, so the two fields can share storage in a union. Each conversion operator now checks the token type and throws if it does not match.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        // The return type is std::wstring, so the literal must be wide.
        default: return L"Unknown token.";
    }
}
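The type check in the conversion operators can be exercised without the test framework. The sketch below repeats a trimmed copy of Token so that it compiles on its own; RefusesNumberAccess is a hypothetical helper written for this example.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Trimmed copy of the article's Token, kept only to make the example self-contained.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: reports whether reading the token as a number is refused.
inline bool RefusesNumberAccess(const Token &token) {
    try {
        static_cast<void>(static_cast<double>(token));
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}
```

Without the check, reading m_number from a token that stores an operator would read the inactive member of the union, so the exception turns a silent error into a visible one.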







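Token can now hold a number, so it is time to return to the lexer and to Should_tokenize_single_digit. A check for a decimal digit is enough to pass it:

if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };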





A single digit is not a number yet. The next test uses a number with several digits and a fractional part.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





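There is no need to parse numbers by hand, because the C standard library already has functions for this: isdigit to recognize a digit and atof to convert text to a number, or rather their wchar_t counterparts, iswdigit and _wtof (the latter is specific to the Microsoft C runtime). This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}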





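Now two tokens in a row: an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}

The function returns as soon as it has found one token, so it cannot pass this test as it is. First comes a refactoring that keeps the existing tests green: tokens are collected in a result variable, which is returned at the end.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}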





Now the transformation itself: (if → while). The loop runs until the end of the string, and the pointer moves past each token once it has been read.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





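wcstod replaces _wtof here because, besides the value, it returns a pointer to the first character after the number. That is exactly what the loop needs in order to continue from the right place.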
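The behavior of the end pointer is easy to check in isolation. This is a minimal sketch; the free function ScanNumber below is written for this example and is not the article's member function of the same name.

```cpp
#include <cassert>
#include <cwchar>

// Parses one number at the start of `text` and reports where parsing stopped.
inline double ScanNumber(const wchar_t *text, const wchar_t **rest) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    *rest = end;  // first character that is not part of the number
    return value;
}
```

For the input L"12.5+3" the function returns 12.5 and leaves rest pointing at the plus sign. The lexer calls wcstod only when the current character is a digit, so a leading sign is never consumed as part of a number. The decimal separator that wcstod accepts depends on the current C locale; the default "C" locale uses a period.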



Next on the list: spaces in the expression must be skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





(unconditional → if) again: a character becomes an operator only if it really is one, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}





The function has become long enough to need refactoring. The scanning logic moves into a Tokenizer class in the Detail namespace, where it is split into small named steps, and Tokenize only drives it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







Numbers and the plus sign work, so the last lexer test takes a complete expression with all the operators and parentheses.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}

Operator gets the missing values, each equal to its character.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





The test still fails, because IsOperator in Tokenizer recognizes only the plus sign. It now checks the current character against every known operator.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
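For comparison, the same membership test can be written against a string of operator characters. This is a hypothetical alternative, not the article's code; IsOperatorChar is a name invented for this sketch.

```cpp
#include <cassert>
#include <string>

// True when `c` is one of the characters the lexer treats as an operator.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}
```

The string version is shorter, but it repeats the characters already stored as enumerator values in Operator, so the two places have to be kept in sync by hand. The any_of version refers to the enumerators by name.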





All tests pass. The complete code of the lexer and its tests:

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

// Each operator is stored as the character it is written with.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

// A token is either an operator or a number; m_type tells which union member is valid.
class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    // Keeps a pointer into expr, so expr must outlive the tokenizer.
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();  // spaces and unknown characters are skipped
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}







The code for this article is available on GitHub.











GitHub . , . "__".



. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







With the Token class in place, we can return to the lexer test for a single digit. The quickest way to make it pass:

    if(expr[0] >= '0' && expr[0] <= '9') {
        return{ (double) expr[0] - '0' };
    }
    return{ static_cast<Operator>(expr[0]) };





This handles only a single digit, so the next test uses a floating-point number.

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





There is no need to parse numbers by hand: the C library already provides isdigit and atof, along with their counterparts for wchar_t. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return{};
        if(iswdigit(*current)) return{ _wtof(current) };
        return{ static_cast<Operator>(*current) };
    }





Next, an expression with more than one token.

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





To return several tokens, first refactor: collect the tokens in a result variable instead of returning immediately.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }





Now apply the transformation (if → while), so that the whole string is processed rather than only its first token.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end;
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





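The wcstod function does the same job as _wtof, but it also returns a pointer to the first character after the parsed number. That pointer is what the loop needs to advance through the string.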
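How wcstod reports the end of the parsed number can be checked in isolation. The sketch below assumes the default "C" locale; the wrapper name ParseLeadingNumber is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: parses the number at the start of `text` and reports
// how many characters it occupied. When no number is found, wcstod leaves
// `end` equal to `text`, so `consumed` is 0.
inline double ParseLeadingNumber(const wchar_t *text, std::size_t &consumed) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    consumed = static_cast<std::size_t>(end - text);
    return value;
}
```

For the input L"12.34+5" the helper consumes five characters and stops at the plus sign, so the lexer can continue from there.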



The next test checks that spaces are skipped.

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





The (unconditional → if) transformation adds a branch: only a known operator character becomes a token, and everything else is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current;
        }
    }
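The same three-way loop can be run with a standard compiler if tokens are simplified to their text. In the sketch below, SplitLexemes and the vector of strings are stand-ins chosen for this illustration, not the article's Tokens type.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Simplified stand-in for the lexer loop: each token is kept as its lexeme.
inline std::vector<std::wstring> SplitLexemes(const std::wstring &expr) {
    std::vector<std::wstring> result;
    const wchar_t *current = expr.c_str();
    while (*current) {
        if (std::iswdigit(*current)) {
            // A number: let wcstod find where it ends.
            wchar_t *end = nullptr;
            std::wcstod(current, &end);
            result.push_back(std::wstring(current, static_cast<std::size_t>(end - current)));
            current = end;
        } else if (*current == L'+') {
            // The only operator known at this step.
            result.push_back(std::wstring(1, *current));
            ++current;
        } else {
            // Anything else, including spaces, is skipped.
            ++current;
        }
    }
    return result;
}
```

Skipping unknown characters silently is what makes the spaces test pass, but it also hides typos in the input; reporting them is a separate decision.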





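The function is getting long, so it is time to refactor. The logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }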





Detail::Tokenizer:

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            return *m_current == static_cast<wchar_t>(Operator::Plus);
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







The remaining operators and the parentheses are still missing, so the next test covers a complex expression.

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }





Extend the Operator enumeration with the new values.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





Then update the IsOperator method of the Tokenizer class so that it recognizes all of them.

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }
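The membership check can be exercised on its own as a free function. In this sketch the name IsOperatorChar and the static array are assumptions made for the illustration; the article keeps the check as a member of Tokenizer.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function version of IsOperator: true when `ch` is the
// character code of one of the known operators.
inline bool IsOperatorChar(wchar_t ch) {
    static const Operator all[] = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(std::begin(all), std::end(all),
        [ch](Operator o) { return ch == static_cast<wchar_t>(o); });
}
```

Because the enumerators carry their own character codes, adding an operator means touching only the enumeration and this list.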





The complete code at this point:



Interpreter.h:

    #pragma once
    #include <vector>
    #include <string>     // std::wstring, std::to_wstring
    #include <stdexcept>  // std::logic_error, std::out_of_range
    #include <wchar.h>
    #include <algorithm>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return{ static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator: return L"Operator";
            case TokenType::Number: return L"Number";
            default: throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) : m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator)
                throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number)
                throw std::logic_error("Should be number token.");
            return m_number;
        }

        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = {
                Operator::Plus, Operator::Minus, Operator::Mul,
                Operator::Div, Operator::LParen, Operator::RParen
            };
            return std::any_of(all.begin(), all.end(),
                [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer

    } // namespace Interpreter







InterpreterTests.cpp:

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
                Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
                Token(Operator::Minus), Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







GitHub . , . "__".



. , .




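The token list is a plain std::vector, aliased as Tokens. A minimal header is enough to make the test compile and fail.

    #pragma once
    #include <vector>
    #include <string>     // std::string
    #include <exception>  // std::exception

    namespace Interpreter {

    struct Token {};
    typedef std::vector<Token> Tokens;

    namespace Lexer {

    inline Tokens Tokenize(std::string expr) {
        throw std::exception();
    }

    } // namespace Lexer

    } // namespace Interpreter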





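When there are several ways to make a test pass, the choice matters: some changes keep the code simple, while others complicate it too early. Robert Martin's The Transformation Priority Premise (TPP) addresses this. Every change that makes a test pass is a transformation of the code, the transformations are ordered from simple to complex, and the simpler one is preferred whenever it is enough.

The transformations, in order of priority:

  1. ({} → nil): no code at all becomes code that uses nil.
  2. (nil → constant): nil is replaced with a constant.
  3. (constant → constant+): a simple constant becomes a more complex one.
  4. (constant → scalar): a constant is replaced with a variable or an argument.
  5. (statement → statements): more unconditional statements are added (break, continue, return and the like).
  6. (unconditional → if): the execution path is split by a condition.
  7. (scalar → array): a variable is replaced with an array.
  8. (array → container): an array is replaced with a container.
  9. (statement → recursion): a statement is replaced with recursion.
  10. (if → while): a condition is replaced with a loop.
  11. (expression → function): an expression is replaced with a function or an algorithm.
  12. (variable → assignment): the value of a variable is changed.

The simplest transformation is enough to make the first test pass.

    inline Tokens Tokenize(std::string expr) {
        return{};
    }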





The test passes. From here on std::string is replaced with std::wstring, so that the lexer works with Unicode text. The next test expects a single operator.

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }





AssertRange is a small helper defined in the test file. The framework's AreEqual compares single values, so comparing an expected sequence of tokens with the actual one needs a wrapper that checks the sizes and then each position.

AssertRange:

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)),
                         distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange
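The comparison logic behind AssertRange does not depend on the Visual Studio framework. The sketch below is a portable variant that returns the position of the first difference instead of asserting; the name FirstMismatch and the -1 convention are assumptions made for this illustration.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical portable counterpart of AssertRange::AreEqual.
// Returns -1 when both ranges are equal, otherwise the index of the first
// differing element (or the length of the shorter range when sizes differ).
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for (; expectIter != std::end(expect) && actualIter != std::end(actual);
           ++expectIter, ++actualIter, ++position) {
        if (!(*expectIter == *actualIter))
            return position;
    }
    // One range ended before the other: the sizes differ.
    if (expectIter != std::end(expect) || actualIter != std::end(actual))
        return position;
    return -1;
}
```

Taking the expected values as an initializer_list is what allows the braces at the call site, as in the tests of this article.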







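Operator is an enumeration whose underlying type is wchar_t, so an operator character converts to it with a simple cast. For now a token is just an operator.

    enum class Operator : wchar_t {
        Plus = L'+',
    };

    typedef Operator Token;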





For Assert to print values of a custom type in its failure messages, a ToString function must be provided for that type.

std::wstring ToString(const Token &):

    inline std::wstring ToString(const Token &token) {
        return{ static_cast<wchar_t>(token) };
    }







The test now compiles but fails, because the lexer always returns an empty list. To fix it, apply (unconditional → if).

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return{};
        }
        return{ static_cast<Operator>(expr[0]) };
    }
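The two transformations applied to Tokenize so far can be compared side by side. This sketch simplifies a token to a bare wchar_t, and the names TokenizeStep1 and TokenizeStep2 are hypothetical labels for the two stages.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for this illustration: a token is just its character.
typedef std::vector<wchar_t> Tokens;

// After the first transformation: every input yields an empty list,
// which is enough for the empty-expression test.
inline Tokens TokenizeStep1(const std::wstring &) {
    return Tokens();
}

// After (unconditional → if): the execution path splits on emptiness,
// and a non-empty input yields its first character as a token.
inline Tokens TokenizeStep2(const std::wstring &expr) {
    if (expr.empty()) {
        return Tokens();
    }
    return Tokens(1, expr[0]);
}
```

Each step adds only what the newest failing test demands: the second version still ignores everything after the first character.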





The next test is for a number.

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





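A token must now be able to hold either an operator or a number. Several designs were considered, among them a Token base class with derived token types distinguished through dynamic_cast, std::function, and Boost.Any.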



InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}







GitHub . , . "__".



. , .




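The lexer returns its result as a std::vector of tokens, aliased as Tokens. A stub that throws is enough to make the first test compile and fail:

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter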





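To choose the next change to the code, this walkthrough relies on The Transformation Priority Premise (TPP): when making a failing test pass, prefer the simplest transformation that works, and move to a more complex one only when the simpler ones are not enough.

The transformations, from simplest to most complex:

  1. ({} → nil) from no code at all to code that returns nil.
  2. (nil → constant) replace nil with a constant.
  3. (constant → constant+) replace a simple constant with a more complex one.
  4. (constant → scalar) replace a constant with a variable or an argument.
  5. (statement → statements) add more unconditional statements (break, continue, return).
  6. (unconditional → if) split the execution path.
  7. (scalar → array) replace a variable with an array.
  8. (array → container) replace an array with a container.
  9. (statement → recursion) replace a statement with recursion.
  10. (if → while) replace a condition with a loop.
  11. (expression → function) replace an expression with a function or algorithm.
  12. (variable → assignment) replace the value of a variable.

The simplest change is enough to make the first test pass: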



inline Tokens Tokenize(std::string expr) { return{}; }
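As a rough illustration of how these transformations follow one another, here is a minimal sketch on a toy function. CountPluses and its three versions are hypothetical names that do not appear in the article; each version stands for the code after one more test has been added.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical toy function, grown in three steps. Each version is kept
// under its own name only so the steps can be compared side by side.

// Step 1, (nil -> constant): the only test so far uses an empty string.
inline std::size_t CountPlusesV1(const std::wstring &) { return 0; }

// Step 2, (unconditional -> if): a test with L"+" forces a branch.
inline std::size_t CountPlusesV2(const std::wstring &expr) {
    if(!expr.empty() && expr[0] == L'+') return 1;
    return 0;
}

// Step 3, (if -> while): a test with several characters forces a loop.
inline std::size_t CountPlusesV3(const std::wstring &expr) {
    std::size_t count = 0;
    std::size_t i = 0;
    while(i < expr.size()) {
        if(expr[i] == L'+') ++count;
        ++i;
    }
    return count;
}
```

Each step is the smallest change that satisfies the newest test, which is the same rhythm the lexer follows in the rest of the walkthrough.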





The expression parameter changes from std::string to std::wstring, which also suits Unicode input. The next test covers a single operator:

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}





AssertRange::AreEqual is a small helper built on the framework's AreEqual. It compares the sizes first, then the elements one by one, and reports the position of a mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
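The same element-by-element comparison can be sketched without the test framework, which may help when trying the idea outside Visual Studio. FirstMismatch is a hypothetical helper, not part of the article's code; it returns an index instead of raising an assertion failure.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first differing element,
// or -1 when both ranges have the same length and equal elements.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    for(; expectIter != std::end(expect) && actualIter != std::end(actual); ++expectIter, ++actualIter) {
        if(!(*expectIter == *actualIter)) {
            return std::distance(std::begin(expect), expectIter);
        }
    }
    // One range ended early: the mismatch sits at the length of the shorter one.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) {
        return std::distance(std::begin(expect), expectIter);
    }
    return -1;
}
```

Like AssertRange::AreEqual, it only requires operator== on the element type, so it would also work for Token once that operator exists.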







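Operator is an enumeration whose underlying type is wchar_t, so the value of each enumerator is the operator's character. For now, a token is simply an operator:

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;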





To print values in failure messages, Assert needs a ToString function for the compared type.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}







An empty expression and a single operator are told apart with (unconditional → if):

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





The next test expects a number token:

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





A token now has to hold either an operator or a number. The alternatives mentioned at this point involve a Token hierarchy queried with dynamic_cast, std::function, and Boost.Any. The walkthrough instead gives Token a type tag, starting with a test for the token type:

enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};

…

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};





A ToString overload for TokenType is needed as well. The next test covers a number token:

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}





(constant → scalar) replaces the constant return value with a member variable:

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};





Next, the operator should be retrievable from an operator token:

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





A conversion operator returns the stored value:

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same for numbers:

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two fields can share storage in a union. Each conversion checks the type tag before reading its field.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
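To see the tag check in action, here is a reduced sketch of the same tagged-union idea. MiniToken, Kind, AsNumber, AsOperator and ThrowsOnNumberAccess are hypothetical names chosen for illustration; the article's Token uses conversion operators instead of named accessors.

```cpp
#include <cassert>
#include <stdexcept>

enum class Kind { Operator, Number };

// Hypothetical reduced token: a type tag plus an anonymous union.
class MiniToken {
public:
    MiniToken(wchar_t op) : m_kind(Kind::Operator), m_operator(op) {}
    MiniToken(double num) : m_kind(Kind::Number), m_number(num) {}

    Kind Type() const { return m_kind; }

    // Reading the wrong union member is refused instead of returning garbage.
    wchar_t AsOperator() const {
        if(m_kind != Kind::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    double AsNumber() const {
        if(m_kind != Kind::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    Kind m_kind;
    union {
        wchar_t m_operator;
        double m_number;
    };
};

// Returns true when asking the token for a number raises std::logic_error.
inline bool ThrowsOnNumberAccess(const MiniToken &token) {
    try {
        token.AsNumber();
    } catch(const std::logic_error &) {
        return true;
    }
    return false;
}
```

The tag costs a few bytes per token, but it turns a silent misread of the union into an exception that a test can catch.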







With Token able to hold a number, the single-digit lexer test can pass. The body of Tokenize gets a check for a digit:

if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };





The next test uses a floating-point number with several digits:

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





The C library already has functions for this: isdigit and atof, or rather their wchar_t counterparts iswdigit and _wtof (the latter is Microsoft-specific). This is (expression → function):

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}





Now an operator followed by a number:

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





First comes a refactoring: tokens are collected in a local variable result instead of being returned right away.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}





Then (if → while) turns the single check into a loop:

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





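wcstod replaces _wtof here because it reports, through its second argument, where the parsed number ends, so scanning can continue from that position.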
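The end-pointer behaviour of wcstod can be tried in isolation with a minimal sketch. ScanResult and ScanLeadingNumber are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper type: the parsed value and how many characters it took.
struct ScanResult {
    double value;
    std::size_t length;
};

// Parses the number at the start of text. When text does not start with a
// number, wcstod performs no conversion and leaves end equal to text.
inline ScanResult ScanLeadingNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    // end points at the first character that is not part of the number.
    return { value, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.5+1" the sketch yields the value 12.5 and a length of 4, which is exactly the offset where the lexer would pick up the '+' operator.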



Next, spaces should be skipped:

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





(unconditional → if) adds an explicit check for the operator; any other character is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}





The function has grown, so the logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







Finally, a test with a complete expression that uses every operator and parentheses:

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}





The Operator enumeration gets the remaining operators:

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





IsOperator in Tokenizer now checks the current character against all of them:

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
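One possible alternative, shown here only as a sketch and not taken from the article, is to look the character up in a string of operator characters with wcschr. IsOperatorChar is a hypothetical free function; the trade-off is that the character list has to be kept in sync with the Operator enumeration by hand.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to the any_of version: search a string of
// operator characters. The explicit zero check matters because wcschr
// also matches the terminating null character of the searched string.
inline bool IsOperatorChar(wchar_t c) {
    static const wchar_t operators[] = L"+-*/()";
    return c != L'\0' && std::wcschr(operators, c) != nullptr;
}
```

The any_of version in the article avoids the duplicated character list, which is a reasonable argument for keeping it.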





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (ifwhile). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





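AssertRange is a small helper written for these tests. Its AreEqual function compares an expected initializer list with the actual range: it checks the sizes first, then each pair of elements, and reports the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // The ranges must have the same length.
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    // Compare element by element and report the first differing position.
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange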
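The same comparison can be sketched without any test framework, which may help when trying the idea outside Visual Studio. The function name FirstMismatch and its return convention are assumptions made for this illustration, not part of CppUnitTestFramework or of the article's code.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical framework-free helper.
// Returns -1 when both ranges are equal; otherwise the index of the first
// differing element, or the length of the shorter range if only sizes differ.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = expect.begin();
    auto actualIter = std::begin(actual);
    long position = 0;
    for(; expectIter != expect.end() && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    bool sameSize = expectIter == expect.end() && actualIter == std::end(actual);
    return sameSize ? -1 : position;
}
```

A call such as FirstMismatch({ 1, 2, 3 }, values) deduces T from the braced list, in the same way as AssertRange::AreEqual does.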







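The test also needs an Operator type. It is an enum class with wchar_t as its underlying type, so each enumerator can be given the character of the operator it represents. For now, Token is simply an alias for Operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;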
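Giving the enumeration a wchar_t underlying type means that converting between a character and an operator is a plain static_cast in both directions. The sketch below shows this; the helper names ToChar and FromChar are hypothetical and are used only for illustration.

```cpp
#include <cassert>

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
};

// Hypothetical helpers: the enumerator value is the operator character itself.
inline wchar_t ToChar(Operator op) {
    return static_cast<wchar_t>(op);
}

inline Operator FromChar(wchar_t ch) {
    // No validation here: the caller must make sure ch is a known operator.
    return static_cast<Operator>(ch);
}
```

The cast performs no validation, so the lexer has to check that a character is a known operator before converting it.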





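To report a failure, Assert has to turn the compared values into text, so it requires a ToString function for every custom type it compares.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return { static_cast<wchar_t>(token) };
}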







The test compiles and fails. To make it pass without breaking the first test, apply the (unconditional → if) transformation.

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return {};
    }
    return { static_cast<Operator>(expr[0]) };
}





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





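This test does not compile: a token must now carry either an operator or a number. Several designs are possible, among them a class hierarchy rooted at Token that is queried with dynamic_cast, a design built on callbacks stored in std::function, and a type-erased holder such as Boost.Any. Here a single Token class that reports its own type is used instead, and it is developed test by test like everything else.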



… . . . .

.



enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};





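A ToString overload for TokenType is needed as well, so that Assert can print the values it compares. The next test checks the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}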





The test fails. The (constant → scalar) transformation replaces the hard-coded type with a member variable.

class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





.



class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





.



TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two values can share storage in an anonymous union. Each conversion operator checks the token type first and throws if it does not match.

Token

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    // Only the member selected by m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            return L"Unknown token.";
    }
}
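For comparison only: on a compiler with C++17 support, the hand-written pair of a type tag and a union could be replaced with std::variant. This is an assumption about a newer toolchain, since std::variant is not available in Visual Studio 2013, and the names VariantToken and IsNumber are hypothetical.

```cpp
#include <cassert>
#include <variant>

enum class Operator : wchar_t {
    Plus = L'+',
};

// Hypothetical C++17 alternative to the tagged union used in this article.
using VariantToken = std::variant<Operator, double>;

// Plays the role of Type() == TokenType::Number.
inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}

// Plays the role of operator double(): std::get throws
// std::bad_variant_access when the token holds an operator.
inline double NumberValue(const VariantToken &token) {
    return std::get<double>(token);
}
```

The behavior mirrors the class above: asking a token for the wrong kind of value raises an exception instead of returning garbage.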







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') {
    return { (double) expr[0] - '0' };
}
return { static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





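The C standard library already provides what is needed: isdigit to recognize a digit and atof to convert text to a number, or rather their wchar_t counterparts. This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return {};
    // _wtof is the Microsoft-specific wide-character counterpart of atof.
    if(iswdigit(*current)) return { _wtof(current) };
    return { static_cast<Operator>(*current) };
}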





. .



TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





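A single return statement can no longer produce the answer, because the lexer now has to return several tokens. As a preparatory step, the tokens are collected in a result variable. The new test still fails, but the earlier ones keep passing.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    }
    else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}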





The next transformation is (if → while), so that the lexer processes every character of the expression rather than only the first token.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // resume right after the number
        }
        else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





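wcstod replaces _wtof because, in addition to the value, it returns a pointer to the first character after the number, and that is where scanning has to resume.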
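The effect of the end pointer is easiest to see in isolation. The following sketch, with the hypothetical function name ScanNumbers, collects every number in a string and ignores everything else. It assumes the default "C" locale, in which the decimal separator is a period.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: extracts all numbers from the text, skipping other characters.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current) {
        if(*current >= L'0' && *current <= L'9') {
            wchar_t *end = nullptr;
            // wcstod parses as many characters as form a valid number
            // and stores the position just past them in end.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        }
        else {
            ++current;
        }
    }
    return numbers;
}
```

Because wcstod is only called on a digit, it always consumes at least one character, so the loop cannot stall.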



. .

.



TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





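The (unconditional → if) transformation is enough here: a character becomes an operator token only when it matches Operator::Plus, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current; // skip spaces and any other unknown character
    }
}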





All tests pass, so it is time to refactor. The scanning logic moves into a Tokenizer class inside a nested Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    // Keeps a pointer into expr, so expr must outlive the tokenizer.
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}





The Operator enumeration gains the missing operators and the parentheses.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





The test still fails, because only the plus sign is recognized. The IsOperator method of Tokenizer now has to check the current character against every operator.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
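A shorter way to express the same check is to look the character up in a string of operator characters. This is only a sketch of an alternative, not the approach taken in the article, and the function name IsOperatorChar is hypothetical.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to the std::any_of version.
inline bool IsOperatorChar(wchar_t ch) {
    // wcschr also finds the terminating L'\0', so that case is excluded explicitly.
    return ch != L'\0' && std::wcschr(L"+-*/()", ch) != nullptr;
}
```

The trade-off is that the character list duplicates the values of the Operator enumeration, so the two have to be kept in sync by hand.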





. .



Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    // Only the member selected by m_type is valid.
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    // Keeps a pointer into expr, so expr must outlive the tokenizer.
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}







GitHub . , . "__".






. , .




 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (ifwhile). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





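wcstod replaces _wtof here because, besides the converted value, it reports through its second argument where the number ends, so scanning can continue from that position.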
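The end-pointer behavior of wcstod can be tried in isolation. The following is a minimal sketch, not code from the article: the helper name ParseNumber is hypothetical, and it only wraps the standard std::wcstod call to return the parsed value together with the number of characters consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <utility>

// Hypothetical helper (not part of the article's lexer): parses one number
// at the start of `text` and returns {value, characters consumed}.
// If no number can be parsed, the consumed count is 0.
inline std::pair<double, std::size_t> ParseNumber(const wchar_t *text) {
    assert(text != nullptr);
    wchar_t *end = nullptr;
    const double value = std::wcstod(text, &end);
    // `end` points at the first character that is not part of the number.
    return { value, static_cast<std::size_t>(end - text) };
}
```

For the input L"12.34+5" the consumed count covers only the characters of 12.34, which is exactly what lets the loop above resume at the plus sign.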



The next test checks that spaces are skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





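The (unconditional → if) transformation handles this: a character becomes an operator token only if it matches a known operator, and everything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    }
    else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    }
    else {
        ++current; // skip anything that is neither a number nor an operator
    }
}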





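Time to refactor. The scanning logic moves into a separate class in the Detail namespace, and the Tokenize function simply delegates to it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}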





Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        return *m_current == static_cast<wchar_t>(Operator::Plus);
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







The last lexer test covers a complete expression with all operators and parentheses.

TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
        Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
        Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}





To make it pass, the Operator enumeration is extended with the remaining operators and the parentheses.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





Then the IsOperator method of the Tokenizer class is updated to recognize all of them.

bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
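The same check can be exercised outside the class. This is a minimal sketch under stated assumptions: IsOperatorChar is a hypothetical free function, not part of the article's code, and it takes the character as a parameter instead of reading the tokenizer's m_current.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function variant of Tokenizer::IsOperator.
// Returns true when `symbol` is the character of one of the known operators.
inline bool IsOperatorChar(wchar_t symbol) {
    const std::initializer_list<Operator> all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [symbol](Operator o) { return symbol == static_cast<wchar_t>(o); });
}
```

A call such as assert(IsOperatorChar(L'+')) holds, while a space or a digit is rejected. Because the enumerators carry their own character codes, adding an operator means touching only the enumeration and this list.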





Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            }
            else if(IsOperator()) {
                ScanOperator();
            }
            else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul),
            Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4),
            Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests







The source code for this article is available on GitHub.




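For the test to compile, Tokens is defined as a std::vector of Token, and Tokenize gets a stub that always fails, so the test starts out red.

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer
} // namespace Interpreter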





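Which change should make a failing test pass? Robert Martin's Transformation Priority Premise (TPP) offers guidance. Refactorings change the structure of code without changing its behavior; transformations are their counterpart: small changes that alter behavior and make the code more general. The transformations are ordered from simple to complex, and the premise is that always choosing the simplest one that makes the current test pass leads to better algorithms and helps avoid dead ends.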



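The transformations, in order of priority:

  1. ({} → nil): from no code at all to code that uses nil.
  2. (nil → constant): from nil to a constant.
  3. (constant → constant+): from a simple constant to a more complex one.
  4. (constant → scalar): replacing a constant with a variable or an argument.
  5. (statement → statements): adding more unconditional statements (break, continue, return, and the like).
  6. (unconditional → if): splitting the execution path.
  7. (scalar → array): from a variable or argument to an array.
  8. (array → container): from an array to a more complex container.
  9. (statement → recursion): replacing a statement with recursion.
  10. (if → while): replacing a condition with a loop.
  11. (expression → function): replacing an expression with a function or algorithm.
  12. (variable → assignment): replacing the value of a variable.

One of the simplest transformations is enough to make the first test pass.

inline Tokens Tokenize(std::string expr) {
    return{};
}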
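To see how transformations accumulate, here is a small illustration that is not from the article. It assumes a hypothetical task, counting the digits in a string, and shows three successive versions of a hypothetical CountDigits function, each produced by the simplest transformation that makes the next test pass. The Step1 to Step3 namespaces exist only so that all versions can coexist in one file.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

namespace Step1 {
// (nil → constant): "an empty string has no digits" needs only a constant.
inline int CountDigits(const std::wstring &) { return 0; }
} // namespace Step1

namespace Step2 {
// (unconditional → if): "a single digit counts as one" splits the path.
inline int CountDigits(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}
} // namespace Step2

namespace Step3 {
// (if → while), together with (constant → scalar) for the counter:
// "several digits anywhere in the string" turns the condition into a loop.
inline int CountDigits(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(std::iswdigit(text[i])) ++count;
        ++i;
    }
    return count;
}
} // namespace Step3
```

Each version still passes the tests written for the previous ones, which is the point of the premise: generality is added one small, ordered step at a time.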





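Before writing the next test, std::string is replaced with std::wstring so that the lexer works with Unicode strings. The next test tokenizes a single plus operator.

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}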





AssertRange is a small helper of our own. Its AreEqual function compares an expected initializer list with an actual range, first by size and then element by element, and reports the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
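The comparison logic does not depend on the test framework. As a rough sketch, assuming only the standard library, a hypothetical FirstMismatch function (not part of the article's code) can return the position that AssertRange::AreEqual would report, instead of asserting.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>

// Hypothetical framework-free analogue of AssertRange::AreEqual.
// Returns -1 when the sequences are equal, the index of the first differing
// element otherwise, or the length of the shorter sequence when one is a
// prefix of the other.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) return position;
    }
    const bool sameSize =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameSize ? -1 : position;
}
```

With int values[] = { 1, 2, 3 }, the call FirstMismatch({ 1, 5, 3 }, values) points at index 1. Taking the expected values as an initializer list is what allows the terse { ... } syntax at the call site, as in the article's tests.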







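Operator is an enumeration whose underlying type is wchar_t, so the value of each operator is simply its character. For now a token is nothing more than an operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;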





For Assert to print the compared values in a failure message, a ToString function must be provided for the type.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}







The simplest change that makes the test pass is the (unconditional → if) transformation.

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





The next test tokenizes a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





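This test does not compile: a token can no longer be just an operator, it must also be able to hold a number. Several designs are possible, among them a class hierarchy with a common Token base class queried through dynamic_cast, std::function, and Boost.Any. Here Token becomes a class of its own that carries a type tag, and it is developed test-first in a separate test class.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};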





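A ToString function is needed for TokenType as well, so that Assert can print it. The next test checks the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}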





The (constant → scalar) transformation makes it pass: the hard-coded type becomes a member variable.

class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};





Next, an operator token must give back the operator it was created with.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





The operator is stored in the token and exposed through a conversion operator.

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same is required for a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





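Because a token holds either an operator or a number, but never both, the two values can share storage in a union, with the type tag telling which one is valid.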
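The tagged-union idea can be shown on its own. The sketch below is an illustration under stated assumptions, not the article's Token: the class IntOrDouble and its member names are hypothetical, chosen to keep the example independent of the lexer types.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical tagged union: holds either an int or a double, never both.
class IntOrDouble {
public:
    explicit IntOrDouble(int value) : m_isInt(true), m_int(value) {}
    explicit IntOrDouble(double value) : m_isInt(false), m_double(value) {}

    bool IsInt() const { return m_isInt; }

    // Accessors check the tag first, so the inactive member is never read.
    int AsInt() const {
        if(!m_isInt) throw std::logic_error("Holds a double.");
        return m_int;
    }

    double AsDouble() const {
        if(m_isInt) throw std::logic_error("Holds an int.");
        return m_double;
    }

private:
    bool m_isInt;      // the tag: tells which union member is active
    union {            // anonymous union: both members share the same storage
        int m_int;
        double m_double;
    };
};
```

Asking a value constructed from 1.5 for AsInt() throws std::logic_error rather than reinterpreting the stored bytes. The cost of this design is that every accessor has to consult the tag; the benefit is that the object stays small and needs no heap allocation or virtual dispatch.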



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (ifwhile). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
            Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests







GitHub . , . "__".



. , .




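The list of tokens is a std::vector, aliased as Tokens. To begin with, we write just enough code for the test to compile and fail.

#pragma once

#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception(); // deliberately fails: this is the "red" step
}

} // namespace Lexer
} // namespace Interpreter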





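To get from a failing test to a passing one in the smallest possible step, we can lean on The Transformation Priority Premise (TPP). A transformation is the counterpart of a refactoring: a refactoring changes the structure of the code without changing its behavior, while a transformation changes the behavior. The transformations are ordered from simple to complex, and the premise is that the simpler one should be preferred whenever it is enough to make the test pass.

The transformations, in priority order:

  1. ({} → nil) From no code at all to code that uses nil.
  2. (nil → constant) From nil to a constant.
  3. (constant → constant+) From a simple constant to a more complex one.
  4. (constant → scalar) Replacing a constant with a variable or an argument.
  5. (statement → statements) Adding more unconditional statements (break, continue, return and the like).
  6. (unconditional → if) Splitting the execution path.
  7. (scalar → array) From a variable or argument to an array.
  8. (array → container) From an array to a more complex container.
  9. (statement → recursion) Replacing a statement with recursion.
  10. (if → while) Replacing a condition with a loop.
  11. (expression → function) Replacing an expression with a function or algorithm.
  12. (variable → assignment) Changing the value of a variable.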
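As a rough illustration that is not taken from the article's project, the sketch below grows a hypothetical CountPlus function through three of these transformations. The function name and the V1 to V3 suffixes are assumptions made up for this example; each version is the simplest code that satisfies one more imagined test.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// (nil → constant): the first test only expects 0 for an empty string.
inline int CountPlusV1(const std::wstring &) {
    return 0;
}

// (unconditional → if): a second test, L"+" gives 1, forces a branch.
inline int CountPlusV2(const std::wstring &text) {
    if(!text.empty() && text[0] == L'+') {
        return 1;
    }
    return 0;
}

// (if → while): a third test with several pluses turns the branch into a loop.
inline int CountPlusV3(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(text[i] == L'+') ++count;
        ++i;
    }
    return count;
}

// Each version passes the tests that motivated it.
inline void CountPlusDemo() {
    assert(CountPlusV1(L"") == 0);
    assert(CountPlusV2(L"+") == 1);
    assert(CountPlusV3(L"+1+") == 2);
}
```

The point of the ordering is that V2 is not written until a test rules out V1, and V3 not until a test rules out V2.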

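For the first test a trivial change is enough: instead of throwing, the function returns an empty list.

inline Tokens Tokenize(std::string expr) {
    return{};
}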





The test now passes, so we move on. Along the way, std::string is replaced with std::wstring, so that the lexer works with Unicode input. The next test:

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}





AssertRange is not part of the framework. It is a small helper whose AreEqual compares an expected initializer list with the actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
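Outside Visual Studio the same idea can be tried without CppUnitTestFramework. The sketch below is a stand-in under stated assumptions, not the article's helper: FirstMismatch is a hypothetical name, and instead of failing a test it returns the index of the first differing element, the length of the shorter range if one is a prefix of the other, or -1 when the ranges are equal.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: index of the first difference, or -1 if the ranges match.
template<class T, class ActualRange>
std::ptrdiff_t FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    std::ptrdiff_t index = 0;

    // Walk both ranges in lockstep until one of them ends.
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }

    // No element differed; the ranges are equal only if both ended together.
    const bool sameLength = expectIter == std::end(expect) && actualIter == std::end(actual);
    return sameLength ? -1 : index;
}

inline void FirstMismatchDemo() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1);
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);
    assert(FirstMismatch({ 1, 2 }, actual) == 2);
}
```

Like the original, it only requires operator== on the element type, so it would work with Token once that class defines equality.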







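Operator is an enum class. Its underlying type is wchar_t, so a character of the expression can be turned into an operator with a simple cast. For now, Token is just an alias for it.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;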
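A small sketch of why the wchar_t underlying type is convenient: the enumerator values are the operator characters themselves, so conversion in both directions is a plain static_cast. The helper names ToChar and FromChar are made up for this illustration, and FromChar does no validation of its argument.

```cpp
#include <cassert>

enum class Operator : wchar_t {
    Plus = L'+',
};

// Hypothetical helpers, not part of the article's code.
inline wchar_t ToChar(Operator op) {
    return static_cast<wchar_t>(op);
}

inline Operator FromChar(wchar_t symbol) {
    return static_cast<Operator>(symbol); // caller must pass a known operator character
}

inline void OperatorCastDemo() {
    assert(ToChar(Operator::Plus) == L'+');
    assert(FromChar(L'+') == Operator::Plus);
}
```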





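For the test to compile, Assert needs a ToString function for the compared type: it uses it to print the values when a comparison fails.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}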







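The test compiles but fails, because Tokenize still returns an empty list. To make it pass, we apply the (unconditional → if) transformation.

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}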





Both tests pass. The next test is for a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





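This test does not compile: a token now has to hold either an operator or a number. There are several ways to get there:

A class hierarchy with a common Token base, where the concrete type is recovered with dynamic_cast.

A design built around std::function.

A type-erased value such as Boost.Any.

Here we take a simpler route: a single Token class that knows its own type. We begin with a test for that type.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};

…

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};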





ToString is needed for TokenType as well, so that Assert can print the values. The next test covers a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}





The test fails. The (constant → scalar) transformation fixes it: the constant returned by Type() becomes a member variable.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}

    TokenType Type() const { return m_type; }

private:
    TokenType m_type;
};





Next, a test that gets the operator back out of a token.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





The operator is stored in the token and returned by a conversion operator.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same for a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the type and throw on a mismatch.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union { // only the member selected by m_type is valid
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
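To see the type check in action without the test framework, here is a condensed, self-contained copy of the class (ToString is left out) together with a hypothetical NumberAccessThrows helper that is not part of the article's code.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union { // only the member selected by m_type is valid
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true if reading the token as a number throws.
inline bool NumberAccessThrows(const Token &token) {
    try {
        (void)static_cast<double>(token);
        return false;
    } catch(const std::logic_error &) {
        return true;
    }
}

inline void TokenDemo() {
    assert(static_cast<double>(Token(1.5)) == 1.5);
    assert(!NumberAccessThrows(Token(1.5)));
    assert(NumberAccessThrows(Token(Operator::Plus)));
}
```

The tag in m_type is what makes the union safe to use: without the checks, reading the inactive member would go unnoticed.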







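With Token in place, we return to the lexer test for a single digit. The simplest change that makes it pass:

if(expr[0] >= L'0' && expr[0] <= L'9') {
    return{ static_cast<double>(expr[0] - L'0') };
}
return{ static_cast<Operator>(expr[0]) };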





This works, but only for a single digit. The next test asks for more.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





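Parsing a number by hand is not worth it, because the C standard library already has what we need: isdigit recognizes a digit and atof converts a string to a number, and both have counterparts that work with wchar_t. This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    // _wtof is the Microsoft-specific wide-character version of atof.
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}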





The test passes. Next, an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





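The current code returns as soon as it has recognized the first token. As a preparatory step we introduce a result variable that collects the tokens; the behavior does not change yet.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}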





Now the transformation itself: (if → while). The check for the end of the string becomes a loop that runs until the end is reached.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // continue right after the parsed number
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





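wcstod replaces _wtof here because, besides the value, it reports through its second argument where the number ended. Scanning then continues from that position instead of guessing how many characters were consumed.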
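The end-pointer behavior of wcstod is what keeps the loop moving. A minimal standalone sketch follows; ScanNumbers is a hypothetical name, everything that is not part of a number is simply skipped, and the decimal separator is assumed to be '.', as in the default "C" locale.

```cpp
#include <cassert>
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in the text.
inline std::vector<double> ScanNumbers(const std::wstring &text) {
    std::vector<double> numbers;
    const wchar_t *current = text.c_str();
    while(*current != L'\0') {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            // wcstod stores in `end` the address of the first character after the number.
            numbers.push_back(std::wcstod(current, &end));
            current = end;
        } else {
            ++current; // not a digit: skip one character
        }
    }
    return numbers;
}

inline void ScanNumbersDemo() {
    const std::vector<double> numbers = ScanNumbers(L"1+12.5");
    assert(numbers.size() == 2);
    assert(numbers[0] == 1.0);
    assert(numbers[1] == 12.5);
}
```

Because the call is only made when the current character is a digit, wcstod always consumes at least one character, so the loop cannot stall.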



The next test checks that spaces are skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





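The (unconditional → if) transformation again: a character becomes an operator token only if it matches a known operator, and anything else is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current; // a space or any other unknown character
    }
}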





All tests pass, so it is time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and Tokenize becomes a thin wrapper around it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







One more test remains: an expression that uses every operator and the parentheses.

TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div),
        Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen)
    }, tokens);
}





Operator gets the missing values.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





IsOperator in Tokenizer now has to recognize all of them.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
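The same check can be written as a free function for experimenting outside the class. IsOperatorChar is a hypothetical name, and the enumeration is repeated here only to keep the sketch self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical free-function version of Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t symbol) {
    // `all` is deduced as std::initializer_list<Operator>.
    const auto all = { Operator::Plus, Operator::Minus, Operator::Mul,
                       Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(),
        [symbol](Operator o) { return symbol == static_cast<wchar_t>(o); });
}

inline void IsOperatorCharDemo() {
    assert(IsOperatorChar(L'*'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
}
```

The list has to be kept in sync with the enumeration by hand; with six values that is an acceptable price for not needing a lookup table.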





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







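Now the test compiles but fails, because an empty list is always returned. To pass it we apply the (unconditional → if) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        if(expr.empty()) {
            return {};
        }
        return { static_cast<Operator>(expr[0]) };
    }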





. . . …

.



    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



    enum class TokenType { Operator, Number };

    class Token {
    public:
        Token(Operator) {}
        TokenType Type() const { return TokenType::Operator; }
    };

    …

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }
    };





ToString



TokenType



, , . .



    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }





To pass it we apply the (constant → scalar) transformation: the hard-coded constant becomes a member variable.

    class Token {
    public:
        Token(Operator) :m_type(TokenType::Operator) {}
        Token(double) :m_type(TokenType::Number) {}
        TokenType Type() const { return m_type; }

    private:
        TokenType m_type;
    };





. . . .

.



    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }





.



    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        operator Operator() const { return m_operator; }
        …
        Operator m_operator;
    };





.



    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }





A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the stored type before returning a value.

Token

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

    private:
        TokenType m_type;
        // Only the member that matches m_type holds a meaningful value.
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                // Wide literal: a narrow string cannot initialize std::wstring.
                return L"Unknown token.";
        }
    }
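As a side note, a compiler with C++17 support offers std::variant, which keeps track of the active member by itself. The sketch below is an alternative under that assumption, not the article's implementation (the article targets Visual Studio 2013, where std::variant is not available). The type names mirror the article's, and the Operator enum is trimmed to two members to keep the example short.

```cpp
#include <cassert>
#include <variant>

// Trimmed copies of the article's enums, repeated so the sketch is self-contained.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };
enum class TokenType { Operator, Number };

// Hypothetical variant-based Token: the variant replaces m_type plus the union.
class Token {
public:
    Token(Operator op) : m_value(op) {}
    Token(double num) : m_value(num) {}

    TokenType Type() const {
        return std::holds_alternative<Operator>(m_value) ? TokenType::Operator
                                                         : TokenType::Number;
    }

    // std::get throws std::bad_variant_access when the other alternative is stored.
    operator Operator() const { return std::get<Operator>(m_value); }
    operator double() const { return std::get<double>(m_value); }

private:
    std::variant<Operator, double> m_value;
};
```

The trade-off is that a wrong conversion now throws std::bad_variant_access instead of std::logic_error with a descriptive message, so tests that check the exception type would need to change.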







, , , . . .



. . . .

:



    if(expr[0] >= '0' && expr[0] <= '9') {
        return { (double) expr[0] - '0' };
    }
    return { static_cast<Operator>(expr[0]) };





, .



. .

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }





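The C standard library already has what we need here: isdigit checks whether a character is a digit, and atof converts a string to a number. Since the expression is stored as wchar_t, we use their wide-character counterparts. This is the (expression → function) transformation.

    inline Tokens Tokenize(std::wstring expr) {
        const wchar_t *current = expr.c_str();
        if(!*current) return {};
        // _wtof is the Microsoft-specific wide-character version of atof.
        if(iswdigit(*current)) return { _wtof(current) };
        return { static_cast<Operator>(*current) };
    }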





. .



    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }





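This test expects more than one token, so a small refactoring comes first: instead of returning from several places, the function collects tokens in a result variable and returns it at the end.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        if(!*current) return result;
        if(iswdigit(*current)) {
            result.push_back(_wtof(current));
        }
        else {
            result.push_back(static_cast<Operator>(*current));
        }
        return result;
    }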





Now the transformation itself: (if → while). The single check becomes a loop that runs until the end of the string.

    inline Tokens Tokenize(std::wstring expr) {
        Tokens result;
        const wchar_t *current = expr.c_str();
        while(*current) {
            if(iswdigit(*current)) {
                wchar_t *end = nullptr;
                result.push_back(wcstod(current, &end));
                current = end; // continue right after the number
            }
            else {
                result.push_back(static_cast<Operator>(*current));
                ++current;
            }
        }
        return result;
    }





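wcstod does the same job as _wtof, but it also returns a pointer to the first character after the number. That pointer tells the loop where to continue scanning.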
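The behavior of the end pointer is easy to check in isolation. The following sketch wraps std::wcstod in a hypothetical helper, ParseLeadingNumber, which returns both the value and the number of characters consumed. It assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: value of the number at the start of text and how many
// characters it occupies. length is 0 when text does not start with a number.
struct ParsedNumber {
    double value;
    std::size_t length;
};

inline ParsedNumber ParseLeadingNumber(const wchar_t *text) {
    wchar_t *end = nullptr;
    double value = std::wcstod(text, &end);
    // wcstod leaves end == text when no conversion was performed.
    return { value, static_cast<std::size_t>(end - text) };
}
```

Note that wcstod also accepts leading whitespace and a sign, which is why the lexer only calls it after iswdigit has confirmed that the current character is a digit.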



. .

.



    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }





The (unconditional → if) transformation again: an operator is now recognized explicitly, and any other character is skipped.

    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        }
        else if(*current == static_cast<wchar_t>(Operator::Plus)) {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
        else {
            ++current; // not a digit and not an operator: skip it
        }
    }





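Time to refactor. The scanning logic moves into a Tokenizer class in the Detail namespace, and the Tokenize function becomes a thin wrapper around it.

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }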





Detail::Tokenizer

    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail







, . , , . .



    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }





Operator gains the missing members so that the test compiles.

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };





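The test fails. IsOperator in Tokenizer has to recognize all of the operators, not just the plus sign.

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        // True when the current character matches any known operator code.
        return std::any_of(all.begin(), all.end(), [this](Operator o) {
            return *m_current == static_cast<wchar_t>(o);
        });
    }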
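The same check can also be written as a switch over the enum, which avoids building a list on every call. The free function IsOperatorChar below is a hypothetical variant shown for comparison, not the article's code. It relies on Operator having a fixed underlying type (wchar_t), so casting any character to it is well defined.

```cpp
#include <cassert>

// Copy of the article's enum, repeated so the sketch is self-contained.
enum class Operator : wchar_t {
    Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')',
};

// Hypothetical alternative to Tokenizer::IsOperator.
inline bool IsOperatorChar(wchar_t c) {
    switch(static_cast<Operator>(c)) {
        case Operator::Plus:
        case Operator::Minus:
        case Operator::Mul:
        case Operator::Div:
        case Operator::LParen:
        case Operator::RParen:
            return true;
        default:
            return false;
    }
}
```

Either version needs to be updated when a new operator is added; the switch has the small advantage that many compilers can warn about an enumerator that is not handled if the default branch is removed.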





. .



Interpreter.h

    #pragma once
    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>
    #include <wchar.h>
    #include <wctype.h>

    namespace Interpreter {

    enum class Operator : wchar_t {
        Plus = L'+',
        Minus = L'-',
        Mul = L'*',
        Div = L'/',
        LParen = L'(',
        RParen = L')',
    };

    inline std::wstring ToString(const Operator &op) {
        return { static_cast<wchar_t>(op) };
    }

    enum class TokenType { Operator, Number };

    inline std::wstring ToString(const TokenType &type) {
        switch(type) {
            case TokenType::Operator:
                return L"Operator";
            case TokenType::Number:
                return L"Number";
            default:
                throw std::out_of_range("TokenType");
        }
    }

    class Token {
    public:
        Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
        Token(double num) :m_type(TokenType::Number), m_number(num) {}

        TokenType Type() const { return m_type; }

        operator Operator() const {
            if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
            return m_operator;
        }

        operator double() const {
            if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
            return m_number;
        }

        // Tokens are equal when both the type and the stored value match.
        friend inline bool operator==(const Token &left, const Token &right) {
            if(left.m_type == right.m_type) {
                switch(left.m_type) {
                    case Interpreter::TokenType::Operator:
                        return left.m_operator == right.m_operator;
                    case Interpreter::TokenType::Number:
                        return left.m_number == right.m_number;
                    default:
                        throw std::out_of_range("TokenType");
                }
            }
            return false;
        }

    private:
        TokenType m_type;
        union {
            Operator m_operator;
            double m_number;
        };
    };

    inline std::wstring ToString(const Token &token) {
        switch(token.Type()) {
            case TokenType::Number:
                return std::to_wstring(static_cast<double>(token));
            case TokenType::Operator:
                return ToString(static_cast<Operator>(token));
            default:
                throw std::out_of_range("TokenType");
        }
    }

    typedef std::vector<Token> Tokens;

    namespace Lexer {
    namespace Detail {

    class Tokenizer {
    public:
        Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

        void Tokenize() {
            while(!EndOfExperssion()) {
                if(IsNumber()) {
                    ScanNumber();
                }
                else if(IsOperator()) {
                    ScanOperator();
                }
                else {
                    MoveNext();
                }
            }
        }

        const Tokens &Result() const { return m_result; }

    private:
        bool EndOfExperssion() const { return *m_current == L'\0'; }

        bool IsNumber() const { return iswdigit(*m_current) != 0; }

        void ScanNumber() {
            wchar_t *end = nullptr;
            m_result.push_back(wcstod(m_current, &end));
            m_current = end;
        }

        bool IsOperator() const {
            auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
            return std::any_of(all.begin(), all.end(), [this](Operator o) {
                return *m_current == static_cast<wchar_t>(o);
            });
        }

        void ScanOperator() {
            m_result.push_back(static_cast<Operator>(*m_current));
            MoveNext();
        }

        void MoveNext() { ++m_current; }

        const wchar_t *m_current;
        Tokens m_result;
    };

    } // namespace Detail

    inline Tokens Tokenize(const std::wstring &expr) {
        Detail::Tokenizer tokenizer(expr);
        tokenizer.Tokenize();
        return tokenizer.Result();
    }

    } // namespace Lexer
    } // namespace Interpreter







InterpreterTests.cpp

    #include "stdafx.h"
    #include "CppUnitTest.h"
    #include "Interpreter.h"

    namespace InterpreterTests {

    using namespace Microsoft::VisualStudio::CppUnitTestFramework;
    using namespace Interpreter;
    using namespace std;

    namespace AssertRange {

    template<class T, class ActualRange>
    static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
        auto actualIter = begin(actual);
        auto expectIter = begin(expect);

        Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");

        for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
            auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
            Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
        }
    }

    } // namespace AssertRange

    TEST_CLASS(LexerTests) {
    public:
        TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
            Tokens tokens = Lexer::Tokenize(L"");
            Assert::IsTrue(tokens.empty());
        }

        TEST_METHOD(Should_tokenize_single_plus_operator) {
            Tokens tokens = Lexer::Tokenize(L"+");
            AssertRange::AreEqual({ Operator::Plus }, tokens);
        }

        TEST_METHOD(Should_tokenize_single_digit) {
            Tokens tokens = Lexer::Tokenize(L"1");
            AssertRange::AreEqual({ 1.0 }, tokens);
        }

        TEST_METHOD(Should_tokenize_floating_point_number) {
            Tokens tokens = Lexer::Tokenize(L"12.34");
            AssertRange::AreEqual({ 12.34 }, tokens);
        }

        TEST_METHOD(Should_tokenize_plus_and_number) {
            Tokens tokens = Lexer::Tokenize(L"+12.34");
            AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_skip_spaces) {
            Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
            AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
        }

        TEST_METHOD(Should_tokenize_complex_experssion) {
            Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
            AssertRange::AreEqual({
                Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
                Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
                Token(5), Token(Operator::RParen)
            }, tokens);
        }
    };

    TEST_CLASS(TokenTests) {
    public:
        TEST_METHOD(Should_get_type_for_operator_token) {
            Token opToken(Operator::Plus);
            Assert::AreEqual(TokenType::Operator, opToken.Type());
        }

        TEST_METHOD(Should_get_type_for_number_token) {
            Token numToken(1.2);
            Assert::AreEqual(TokenType::Number, numToken.Type());
        }

        TEST_METHOD(Should_get_operator_code_from_operator_token) {
            Token token(Operator::Plus);
            Assert::AreEqual<Operator>(Operator::Plus, token);
        }

        TEST_METHOD(Should_get_number_value_from_number_token) {
            Token token(1.23);
            Assert::AreEqual<double>(1.23, token);
        }
    };

    }







GitHub . , . "__".






. , .




 std::vector,      Tokens
      
      



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (ifwhile) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (ifwhile). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







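The lexer now handles numbers, the plus operator and spaces. The remaining operators and the parentheses can be covered by a single test with a more complex expression.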



TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





To make the test compile, the Operator enumeration has to be extended with the new operators.



enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





The test still fails. The IsOperator method of the Tokenizer class has to recognise the new operators as well.



bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
        Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
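The same check can also be written as a lookup in a string of operator characters. This is an alternative sketch rather than the article's implementation: IsOperatorChar is a hypothetical free function, and the character set is simply copied from the Operator enumeration above.

```cpp
#include <string>

// Hypothetical free-function variant of IsOperator. The operator set mirrors
// the values of the article's Operator enumeration.
inline bool IsOperatorChar(wchar_t c) {
    static const std::wstring operators = L"+-*/()";
    return operators.find(c) != std::wstring::npos;
}
```

The drawback of this variant is that the character list duplicates the enumeration, so both have to be kept in sync by hand when an operator is added.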





All tests pass. The complete code at this point is listed below.



Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType { Operator, Number };

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div,
            Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer

} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)),
        L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

}
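The listings above depend on Visual Studio and its test framework. To try the same scanning idea with any C++11 compiler, a condensed stand-in can be used. This is a sketch under stated assumptions: Kind, SimpleToken and SimpleTokenize are hypothetical names, and the token is a plain struct instead of the article's Token class.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the article's TokenType and Token.
enum class Kind { Operator, Number };

struct SimpleToken {
    Kind kind;
    wchar_t op;      // meaningful when kind == Kind::Operator
    double number;   // meaningful when kind == Kind::Number
};

inline bool operator==(const SimpleToken &l, const SimpleToken &r) {
    if(l.kind != r.kind) return false;
    return l.kind == Kind::Operator ? l.op == r.op : l.number == r.number;
}

// Same loop as the article's Tokenizer: numbers, operators, skip the rest.
inline std::vector<SimpleToken> SimpleTokenize(const std::wstring &expr) {
    static const std::wstring operators = L"+-*/()";
    std::vector<SimpleToken> result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(std::iswdigit(*current)) {
            wchar_t *end = nullptr;
            const double value = std::wcstod(current, &end);
            result.push_back({ Kind::Number, L'\0', value });
            current = end;
        } else if(operators.find(*current) != std::wstring::npos) {
            result.push_back({ Kind::Operator, *current, 0.0 });
            ++current;
        } else {
            ++current;   // spaces and unknown characters are skipped
        }
    }
    return result;
}
```

Calling SimpleTokenize(L" 1 + 12.34 ") yields three tokens, which corresponds to what the Should_skip_spaces test expects from the real lexer.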







The code for this article is available on GitHub.



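The test does not compile yet. To fix that, we declare an empty Token structure, define Tokens as a std::vector of tokens, and add a Tokenize function that simply throws. Now the test compiles and fails.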



#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};

typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception();
}

} // namespace Lexer

} // namespace Interpreter





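To decide how to change the code so that the test passes, we will use The Transformation Priority Premise (TPP). Refactoring changes the structure of code without changing its behaviour. A transformation, by contrast, changes behaviour, usually by making the code more general. Transformations are ordered by complexity, and the premise is that always picking the simplest transformation that makes the failing test pass leads to simpler code.

The transformations, from the simplest to the most complex:

  1. ({} → nil): from no code at all to code that returns nil.
  2. (nil → constant): replacing nil with a constant.
  3. (constant → constant+): replacing a simple constant with a more complex one.
  4. (constant → scalar): replacing a constant with a variable or an argument.
  5. (statement → statements): adding more unconditional statements (break, continue, return and the like).
  6. (unconditional → if): splitting the execution path.
  7. (scalar → array): replacing a single value with an array.
  8. (array → container): replacing an array with a more complex container.
  9. (statement → recursion): replacing a statement with recursion.
  10. (if → while): replacing a condition with a loop.
  11. (expression → function): replacing an expression with a function or an algorithm.
  12. (variable → assignment): changing the value of a variable.

We apply the first transformation.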
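To make the idea of a transformation concrete, here is a small sketch that is unrelated to the lexer. The CountDigits functions are hypothetical, and the test inputs mentioned in the comments are assumptions chosen for illustration. Each version is obtained from the previous one by a single transformation from the list.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// (nil → constant): the simplest code that passes a test for the empty string.
inline int CountDigitsV1(const std::wstring &) { return 0; }

// (unconditional → if): a test for L"7" forces a branch.
inline int CountDigitsV2(const std::wstring &text) {
    if(!text.empty() && std::iswdigit(text[0])) return 1;
    return 0;
}

// (if → while): a test for L"4a2" forces the branch to become a loop.
inline int CountDigitsV3(const std::wstring &text) {
    int count = 0;
    std::size_t i = 0;
    while(i < text.size()) {
        if(std::iswdigit(text[i])) ++count;
        ++i;
    }
    return count;
}
```

Each step keeps the earlier tests passing while adding only as much generality as the new test demands.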



inline Tokens Tokenize(std::string expr) { return{}; }





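The first test passes. Before moving on to the next test, std::string is replaced with std::wstring so that expressions can contain Unicode characters. The next test checks a single plus operator.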



TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}





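AssertRange is a small helper namespace of our own. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.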



AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)),
        L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
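The comparison logic itself does not depend on the test framework. As a rough sketch, the same walk over two ranges can be written as a function that returns the mismatch position instead of raising an assertion. FirstMismatch is a hypothetical name and is not part of the article's code.

```cpp
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns -1 when both ranges are equal, otherwise the
// index of the first differing element. When one range is a prefix of the
// other, the length of the shorter range is returned.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long index = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++index) {
        if(!(*expectIter == *actualIter)) return index;
    }
    const bool bothDone =
        expectIter == std::end(expect) && actualIter == std::end(actual);
    return bothDone ? -1 : index;
}
```

For a std::vector<double> holding 1.0, 2.0 and 3.0, FirstMismatch({ 1.0, 5.0, 3.0 }, v) reports position 1.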







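Operator is an enumeration whose underlying type is wchar_t, so that a character of the expression can be converted directly into the matching operator. For now a token is nothing more than an operator.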



enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;





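This is still not enough to compile. The Assert class needs a ToString function for the compared type, which it uses to print the values when an assertion fails.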



std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}







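Now everything compiles, and the test fails because Tokenize returns an empty list. To make it pass, we apply the (unconditional → if) transformation.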



inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





The next test on the list is a single digit.



TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





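This test does not compile, because so far a token can only be an operator. It now has to hold a number as well, and there are several ways to achieve that. The options include a Token base class with derived classes that are told apart with dynamic_cast, a design built around std::function, and a type-erasing holder such as Boost.Any.

We take a simpler route and let a token report its own type. That is a separate piece of behaviour, so the lexer test is put aside for a moment and the token gets its own test class.

The first test checks the type of an operator token.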



enum class TokenType { Operator, Number };

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};

…

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};





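As before, a ToString function is needed, this time for TokenType. With it in place the test passes, and the next one checks the type of a number token.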



TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}





The test fails. The (constant → scalar) transformation replaces the returned constant with a member variable.



class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};





Next, a token has to return the operator it was created with.



TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





The implementation stores the operator and adds a conversion operator.



class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same goes for numbers.



TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





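A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the token type and throw on a mismatch.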



Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token.";
    }
}
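For comparison, newer compilers offer a ready-made tagged union. The sketch below assumes C++17, which the toolchain used in this article does not provide, so it is an alternative design rather than the article's method. Op, VariantToken, IsNumber and AsNumber are hypothetical names.

```cpp
#include <variant>

// Hypothetical, reduced operator set, used only for this illustration.
enum class Op : wchar_t { Plus = L'+', Minus = L'-' };

// std::variant keeps track of which alternative is stored, which replaces
// the hand-written m_type member and the union.
using VariantToken = std::variant<Op, double>;

inline bool IsNumber(const VariantToken &token) {
    return std::holds_alternative<double>(token);
}

// Throws std::bad_variant_access when the token holds an operator, in the
// same spirit as the logic_error thrown by the conversion operators above.
inline double AsNumber(const VariantToken &token) {
    return std::get<double>(token);
}
```

The hand-written union gives the same guarantees with more code, but it works with C++11 and keeps the implicit conversions that the tests rely on.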







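With the token able to hold a number, we return to the lexer and its test for a single digit. The simplest change that makes it pass: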



if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };





The test passes. The next test is a floating point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





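There is no need to parse the number by hand: the C standard library already has isdigit to detect a digit and atof to convert text to a number, and both have counterparts that work with wchar_t. This is the (expression → function) transformation.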



inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}





The next test combines an operator and a number.



TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





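The current code can return only one token. Before changing its behaviour, we refactor: the tokens are collected in a result variable, which is returned at the end.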



inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}








. , .




std::vector, Tokens



. .



#pragma once; #include <vector> namespace Interpreter { struct Token {}; typedef std::vector<Token> Tokens; namespace Lexer { inline Tokens Tokenize(std::string expr) { throw std::exception(); } } // namespace Lexer } // namespace Interpreter





, , , . , , The Transformation Priority Premise (TPP) . , , , , . . , , , . , , ( ) , , . , , . , TPP .



:



({} → nil) , . (nil → constant) . (constant → constant+) ( , ). (constant → scalar) , . (statement → statements) (break, continue, return ). (unconditional → if) . (scalar → array) / . (array → container) . (statement → recursion) . (if → while) . (expression → function) . (variable → assignment) .

.



inline Tokens Tokenize(std::string expr) { return{}; }





, . .



. . . . . .

, std::string



std::wstring



. , Unicode. .



TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); }





AssertRange



- , AreEqual



, , , .



AssertRange namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange







Operator



. wchar_t



, , .



enum class Operator : wchar_t { Plus = L'+', }; typedef Operator Token;





, , Assert



, ToString



, .



std::wstring ToString(const Token &) inline std::wstring ToString(const Token &token) { return{ static_cast<wchar_t>(token) }; }







, , . , (unconditional → if).



inline Tokens Tokenize(std::wstring expr) { if(expr.empty()) { return{}; } return{ static_cast<Operator>(expr[0]) }; }





. . . …

.



TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); }





. , , - . , :



, Token



. dynamic_cast



, . , , . std::function



. . Boost.Any, - . .

, . - . , , .



… . . . .

.



enum class TokenType { Operator, Number }; class Token { public: Token(Operator) {} TokenType Type() const { return TokenType::Operator; } }; … TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } };





ToString



TokenType



, , . .



TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); }





. (constant → scalar) .



class Token { public: Token(Operator) :m_type(TokenType::Operator) {} Token(double) :m_type(TokenType::Number) {} TokenType Type() const { return m_type; } private: TokenType m_type; };





. . . .

.



TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); }





.



class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} operator Operator() const { return m_operator; } … Operator m_operator; };





.



TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); }





, , union



. .



Token class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: return "Unknown token."; } }







, , , . . .



. . . .

:



if(expr[0] >= '0' && expr[0] <= '9') { return{ (double) expr[0] - '0' }; } return{ static_cast<Operator>(expr[0]) };





, .



. .

TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); }





, C



, isdigit



, , atof



, , wchar_t



. (expression → function). .



inline Tokens Tokenize(std::wstring expr) { const wchar_t *current = expr.c_str(); if(!*current) return{}; if(iswdigit(*current)) return{ _wtof(current) }; return{ static_cast<Operator>(*current) }; }





. .



TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); }





, . , . . result



, .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); if(!*current) return result; if(iswdigit(*current)) { result.push_back(_wtof(current)); } else { result.push_back(static_cast<Operator>(*current)); } return result; }





: (if → while). , .



inline Tokens Tokenize(std::wstring expr) { Tokens result; const wchar_t *current = expr.c_str(); while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else { result.push_back(static_cast<Operator>(*current)); ++current; } } return result; }





wcstod



, _wtof



, . , . , .



. .

.



TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); }





(unconditional → if) , .



while(*current) { if(iswdigit(*current)) { wchar_t *end = nullptr; result.push_back(wcstod(current, &end)); current = end; } else if(*current == static_cast<wchar_t>(Operator::Plus)) { result.push_back(static_cast<Operator>(*current)); ++current; } else { ++current; } }





. . Detail



. Tokenize



.



inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); }





Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }
    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }
    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }
    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }
    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }
    TEST_METHOD(Should_tokenize_complex_experssion) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }
    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }
    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests







The code for this article is available on GitHub.



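For the first test to compile, the header needs a Token type, a Tokens alias for a std::vector of tokens, and a Tokenize function. At first Tokenize only throws, so the test compiles and fails (red).

#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) { throw std::exception(); }

} // namespace Lexer
} // namespace Interpreter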





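To get from a failing test to a passing one, we will follow Robert Martin's The Transformation Priority Premise (TPP). Transformations are the counterpart of refactorings: a refactoring changes the structure of the code without changing its behavior, while a transformation changes the behavior, moving the code from the specific toward the generic. The transformations are ordered from simple to complex, and the premise is that at each step you should pick the simplest one that makes the current test pass.

The transformations, in priority order:

  1. ({} → nil): from no code at all to code that returns an empty value.
  2. (nil → constant): replace the empty value with a constant.
  3. (constant → constant+): replace a simple constant with a more complex one.
  4. (constant → scalar): replace a constant with a variable or an argument.
  5. (statement → statements): add more unconditional statements (break, continue, return and so on).
  6. (unconditional → if): split the execution path.
  7. (scalar → array): replace a variable or argument with an array.
  8. (array → container): replace an array with a more complex container.
  9. (statement → recursion): replace a statement with a recursive call.
  10. (if → while): replace a condition with a loop.
  11. (expression → function): replace an expression with a function or an algorithm.
  12. (variable → assignment): replace the value of a variable.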
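As a rough illustration of how these transformations drive code from specific to generic, here is a minimal sketch that grows a hypothetical Abs function one test at a time. The function, the namespaces Step1 to Step3, and the checks are assumptions made for this example only; they are not part of the interpreter.

```cpp
#include <cassert>

// Illustrative only: a hypothetical Abs function grown test by test.
namespace Step1 {
// Test "Abs(0) == 0": (nil → constant), return a constant.
inline int Abs(int) { return 0; }
}

namespace Step2 {
// Test "Abs(5) == 5": (constant → scalar), return the argument instead.
inline int Abs(int x) { return x; }
}

namespace Step3 {
// Test "Abs(-5) == 5": (unconditional → if), split the execution path.
inline int Abs(int x) {
    if(x < 0) return -x;
    return x;
}
}

// Each step passes the test that motivated it.
inline void CheckAbsSteps() {
    assert(Step1::Abs(0) == 0);
    assert(Step2::Abs(5) == 5);
    assert(Step3::Abs(-5) == 5);
}
```

Each step uses the highest-priority transformation that satisfies the new test, which is the same discipline applied to Tokenize in the rest of the article.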

Apply the first transformation.

inline Tokens Tokenize(std::string expr) { return{}; }





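The first test now passes. Before writing the next one, change the parameter type from std::string to std::wstring so that the lexer works with Unicode (wide) strings. The next test checks a single plus operator.

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}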





AssertRange is a small helper namespace in the test project. Its AreEqual function compares an expected initializer_list with the actual range: first the sizes, then each pair of elements, reporting the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);
    Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs.");
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange
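Outside Visual Studio, the same comparison can be sketched without CppUnitTestFramework. The helper below is a hypothetical, framework-free variant (the name FirstMismatch is an assumption, not the article's code): it returns the index of the first differing element, or -1 when the ranges are equal, and reports a length difference as a mismatch at the end of the common prefix.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper, not part of the article's code.
// Returns the position of the first mismatch, or -1 if both ranges are equal.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    for(; expectIter != std::end(expect) && actualIter != std::end(actual); ++expectIter, ++actualIter) {
        if(!(*expectIter == *actualIter)) {
            return static_cast<long>(std::distance(std::begin(expect), expectIter));
        }
    }
    // One range ended before the other: report the length of the common prefix.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) {
        return static_cast<long>(std::distance(std::begin(expect), expectIter));
    }
    return -1;
}

// Small usage check for the helper above.
inline void CheckFirstMismatch() {
    assert(FirstMismatch({ 1, 2, 3 }, std::vector<int>{ 1, 2, 3 }) == -1);
    assert(FirstMismatch({ 1, 2, 3 }, std::vector<int>{ 1, 5, 3 }) == 1);
}
```

Returning a position instead of asserting keeps the helper independent of any test framework; a framework-specific wrapper can turn the result into a failure message.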







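To make the test compile, define the Operator enumeration. Its underlying type is wchar_t, so the value of each enumerator is the character of the operator itself. For now, Token is simply an alias for Operator.

enum class Operator : wchar_t {
    Plus = L'+',
};

typedef Operator Token;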
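Because the enumerator values are the operator characters, converting between a character and an Operator is a plain cast in both directions. The sketch below illustrates this; the ToOperator helper and the extra Minus enumerator are assumptions added for the example, and the cast performs no validation.

```cpp
#include <cassert>
#include <string>

// Same idea as in the article: the enumerator's value is the operator's character.
// Minus is added here only to have a second value to demonstrate.
enum class Operator : wchar_t { Plus = L'+', Minus = L'-' };

// Character -> Operator is a plain cast (no validation is performed here).
inline Operator ToOperator(wchar_t c) { return static_cast<Operator>(c); }

// Operator -> one-character string, for example for diagnostic messages.
inline std::wstring ToString(Operator op) {
    return std::wstring(1, static_cast<wchar_t>(op));
}

// Round trip between characters and enumerators.
inline void CheckOperatorCasts() {
    assert(ToOperator(L'+') == Operator::Plus);
    assert(ToString(Operator::Minus) == L"-");
}
```

The price of this compactness is that any character can be cast to Operator, so the caller has to check that the character really is an operator before casting.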





To print values in failure messages, Assert needs a ToString function for the compared type, so add one for Token.

inline std::wstring ToString(const Token &token) {
    return{ static_cast<wchar_t>(token) };
}







Now the test compiles and fails. To make it pass, apply (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return{};
    }
    return{ static_cast<Operator>(expr[0]) };
}





The next test covers a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





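This test does not compile: a token must now be able to hold either an operator or a number. Several designs are possible. One is a Token base class with a derived class per kind of token, which leads to dynamic_cast whenever a value has to be read. Another is dispatch through callbacks built on std::function. A third is a type-erased holder such as Boost.Any. Here we take the simplest route: a single Token class that knows its own type.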
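For comparison, with a C++17 compiler the "either an operator or a number" requirement can also be expressed with std::variant. This is not the approach the article takes (std::variant did not exist in the Visual Studio 2013 toolchain used here); the VariantToken alias and the two helper functions are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <variant>

enum class Operator : wchar_t { Plus = L'+' };

// Hypothetical alternative: a token is either an Operator or a double.
using VariantToken = std::variant<Operator, double>;

// The variant tracks which alternative is active, so no separate type tag is needed.
inline bool IsNumber(const VariantToken &t) { return std::holds_alternative<double>(t); }
inline bool IsOperator(const VariantToken &t) { return std::holds_alternative<Operator>(t); }

// Usage check: construct one token of each kind and read it back.
inline void CheckVariantToken() {
    VariantToken number = 1.5;
    VariantToken plus = Operator::Plus;
    assert(IsNumber(number) && std::get<double>(number) == 1.5);
    assert(IsOperator(plus) && std::get<Operator>(plus) == Operator::Plus);
}
```

Reading the wrong alternative with std::get throws std::bad_variant_access, which plays the same role as the std::logic_error checks in the hand-written Token class.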



Token becomes a class with its own test class, TokenTests. The first test checks that a token created from an operator reports the Operator type.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};

…

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};





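A ToString function is needed for TokenType as well, so that Assert can print it. The next test checks the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}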





The test fails. Apply (constant → scalar): the type is stored in a member variable that each constructor sets.

class Token {
public:
    Token(Operator) :m_type(TokenType::Operator) {}
    Token(double) :m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};





Next, an operator token must return the operator it was created with.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





A conversion operator is enough to make it pass.

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same test for a number token.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two values can share storage in a union. The conversion operators check the stored type and throw if the token is read as the wrong kind.

Token

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: return L"Unknown token."; // wide literal: the function returns std::wstring
    }
}
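The effect of the type checks in the conversion operators can be seen in a small standalone sketch. It repeats a compact copy of the Token class so that it compiles on its own; the helper ThrowsWhenReadAsNumber is a hypothetical name introduced only for this example.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Compact copy of the article's tagged-union Token, for illustration only.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union { Operator m_operator; double m_number; };
};

// Hypothetical helper: true if reading the token as a number throws std::logic_error.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        (void)static_cast<double>(token);
    } catch(const std::logic_error &) {
        return true;
    }
    return false;
}

// An operator token refuses to be read as a number; a number token does not.
inline void CheckTokenConversions() {
    assert(ThrowsWhenReadAsNumber(Token(Operator::Plus)));
    assert(!ThrowsWhenReadAsNumber(Token(1.5)));
}
```

Failing loudly on a wrong read keeps the inactive union member from ever being returned, which would otherwise be undefined behavior.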







With Token in place, return to the lexer test for a single digit. The simplest change that makes it pass:

if(expr[0] >= '0' && expr[0] <= '9') {
    return{ (double) expr[0] - '0' };
}
return{ static_cast<Operator>(expr[0]) };





The test passes. The next test is a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





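To parse the number, use the C library: isdigit to detect a digit and atof to convert the text, or rather their wide-character counterparts iswdigit and _wtof, because the input consists of wchar_t characters. This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return{};
    if(iswdigit(*current)) return{ _wtof(current) };
    return{ static_cast<Operator>(*current) };
}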





The next test is an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





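Returning a single token is no longer enough. Start with a refactoring that keeps the existing tests green: collect the tokens in a result variable and return it at the end.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}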





Now apply (if → while) so that the whole string is processed.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end;
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}

wcstod replaces _wtof here because it also reports, through its second argument, where the parsed number ends, so scanning can continue from that position.
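The role of the end pointer is easiest to see in isolation. The following is a minimal sketch using only the standard library; ScanNumbers is a hypothetical helper (not part of the lexer) that collects every number starting with a digit and skips everything else, and it assumes the default "C" locale, where the decimal separator is a dot.

```cpp
#include <cassert>
#include <cwchar>
#include <vector>

// Illustrative helper, not from the article: collects every number that starts
// with a digit, skipping all other characters.
inline std::vector<double> ScanNumbers(const wchar_t *text) {
    assert(text != nullptr);
    std::vector<double> numbers;
    const wchar_t *current = text;
    while(*current) {
        if(*current >= L'0' && *current <= L'9') {
            wchar_t *end = nullptr;
            numbers.push_back(std::wcstod(current, &end)); // parse one number
            current = end;                                  // resume right after it
        } else {
            ++current;
        }
    }
    return numbers;
}
```

Because a digit is always consumed when parsing starts on one, the end pointer moves forward on every call and the loop is guaranteed to terminate.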



The next test checks that spaces are skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





Apply (unconditional → if): only a plus character is treated as an operator, and any other character is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current;
    }
}





The function has grown, so it is time to refactor. The logic moves into a Tokenizer class inside the Detail namespace, and Tokenize simply delegates to it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







The last lexer test is a complex expression that uses all the operators and parentheses.

TEST_METHOD(Should_tokenize_complex_experssion) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





Add the remaining operators to the Operator enumeration.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





Then update IsOperator in the Tokenizer class so that it recognizes all of them.

bool IsOperator() const {
    auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
    return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
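The same membership test can be written as a lookup in a string of operator characters. This is only a sketch of an alternative, not the article's implementation; IsOperatorChar is a hypothetical free function, and the character set is copied from the Operator enumeration above.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical variant of IsOperator: look the character up in a string of
// operator characters. The explicit L'\0' check matters because std::wcschr
// also finds the terminating null character.
inline bool IsOperatorChar(wchar_t c) {
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}

// Usage check: operators are found, spaces and digits are not.
inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'*'));
    assert(!IsOperatorChar(L' '));
    assert(!IsOperatorChar(L'1'));
}
```

The trade-off is that the character list duplicates the enumerator values, so it has to be kept in sync with Operator by hand, whereas the std::any_of version names the enumerators directly.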





All tests pass. The complete header at this point:

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return{ static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator: return L"Operator";
        case TokenType::Number: return L"Number";
        default: throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) :m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }

    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number: return left.m_number == right.m_number;
                default: throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number: return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator: return ToString(static_cast<Operator>(token));
        default: throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {
namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExperssion()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExperssion() const { return *m_current == L'\0'; }
    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen };
        return std::any_of(all.begin(), all.end(), [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter


















Detail::Tokenizer namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail







, . , , . .



TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); }





Operator



, .



enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', };





. IsOperator



Tokenizer



.



bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); }





. .



Interpreter.h #pragma once; #include <vector> #include <wchar.h> #include <algorithm> namespace Interpreter { enum class Operator : wchar_t { Plus = L'+', Minus = L'-', Mul = L'*', Div = L'/', LParen = L'(', RParen = L')', }; inline std::wstring ToString(const Operator &op) { return{ static_cast<wchar_t>(op) }; } enum class TokenType { Operator, Number }; inline std::wstring ToString(const TokenType &type) { switch(type) { case TokenType::Operator: return L"Operator"; case TokenType::Number: return L"Number"; default: throw std::out_of_range("TokenType"); } } class Token { public: Token(Operator op) :m_type(TokenType::Operator), m_operator(op) {} Token(double num) :m_type(TokenType::Number), m_number(num) {} TokenType Type() const { return m_type; } operator Operator() const { if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token."); return m_operator; } operator double() const { if(m_type != TokenType::Number) throw std::logic_error("Should be number token."); return m_number; } friend inline bool operator==(const Token &left, const Token &right) { if(left.m_type == right.m_type) { switch(left.m_type) { case Interpreter::TokenType::Operator: return left.m_operator == right.m_operator; case Interpreter::TokenType::Number: return left.m_number == right.m_number; default: throw std::out_of_range("TokenType"); } } return false; } private: TokenType m_type; union { Operator m_operator; double m_number; }; }; inline std::wstring ToString(const Token &token) { switch(token.Type()) { case TokenType::Number: return std::to_wstring(static_cast<double>(token)); case TokenType::Operator: return ToString(static_cast<Operator>(token)); default: throw std::out_of_range("TokenType"); } } typedef std::vector<Token> Tokens; namespace Lexer { namespace Detail { class Tokenizer { public: Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {} void Tokenize() { while(!EndOfExperssion()) { if(IsNumber()) { ScanNumber(); } else if(IsOperator()) { 
ScanOperator(); } else { MoveNext(); } } } const Tokens &Result() const { return m_result; } private: bool EndOfExperssion() const { return *m_current == L'\0'; } bool IsNumber() const { return iswdigit(*m_current) != 0; } void ScanNumber() { wchar_t *end = nullptr; m_result.push_back(wcstod(m_current, &end)); m_current = end; } bool IsOperator() const { auto all = { Operator::Plus, Operator::Minus, Operator::Mul, Operator::Div, Operator::LParen, Operator::RParen }; return std::any_of(all.begin(), all.end(), [this](Operator o) {return *m_current == static_cast<wchar_t>(o); }); } void ScanOperator() { m_result.push_back(static_cast<Operator>(*m_current)); MoveNext(); } void MoveNext() { ++m_current; } const wchar_t *m_current; Tokens m_result; }; } // namespace Detail inline Tokens Tokenize(const std::wstring &expr) { Detail::Tokenizer tokenizer(expr); tokenizer.Tokenize(); return tokenizer.Result(); } } // namespace Lexer } // namespace Interpreter







InterpreterTests.cpp #include "stdafx.h" #include "CppUnitTest.h" #include "Interpreter.h" namespace InterpreterTests { using namespace Microsoft::VisualStudio::CppUnitTestFramework; using namespace Interpreter; using namespace std; namespace AssertRange { template<class T, class ActualRange> static void AreEqual(initializer_list<T> expect, const ActualRange &actual) { auto actualIter = begin(actual); auto expectIter = begin(expect); Assert::AreEqual(distance(expectIter, end(expect)), distance(actualIter, end(actual)), L"Size differs."); for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) { auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter)); Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str()); } } } // namespace AssertRange TEST_CLASS(LexerTests) { public: TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) { Tokens tokens = Lexer::Tokenize(L""); Assert::IsTrue(tokens.empty()); } TEST_METHOD(Should_tokenize_single_plus_operator) { Tokens tokens = Lexer::Tokenize(L"+"); AssertRange::AreEqual({ Operator::Plus }, tokens); } TEST_METHOD(Should_tokenize_single_digit) { Tokens tokens = Lexer::Tokenize(L"1"); AssertRange::AreEqual({ 1.0 }, tokens); } TEST_METHOD(Should_tokenize_floating_point_number) { Tokens tokens = Lexer::Tokenize(L"12.34"); AssertRange::AreEqual({ 12.34 }, tokens); } TEST_METHOD(Should_tokenize_plus_and_number) { Tokens tokens = Lexer::Tokenize(L"+12.34"); AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_skip_spaces) { Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 "); AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens); } TEST_METHOD(Should_tokenize_complex_experssion) { Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)"); AssertRange::AreEqual({ Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3), Token(Operator::Div), Token(Operator::LParen), Token(4), 
Token(Operator::Minus), Token(5), Token(Operator::RParen) }, tokens); } }; TEST_CLASS(TokenTests) { public: TEST_METHOD(Should_get_type_for_operator_token) { Token opToken(Operator::Plus); Assert::AreEqual(TokenType::Operator, opToken.Type()); } TEST_METHOD(Should_get_type_for_number_token) { Token numToken(1.2); Assert::AreEqual(TokenType::Number, numToken.Type()); } TEST_METHOD(Should_get_operator_code_from_operator_token) { Token token(Operator::Plus); Assert::AreEqual<Operator>(Operator::Plus, token); } TEST_METHOD(Should_get_number_value_from_number_token) { Token token(1.23); Assert::AreEqual<double>(1.23, token); } }; }







GitHub . , . "__".



. , .




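To make the test compile, we define Tokens as a typedef for std::vector of Token and leave Token as an empty struct for now. Tokenize simply throws, so the first test fails (red).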



#pragma once
#include <exception>
#include <string>
#include <vector>

namespace Interpreter {

struct Token {};
typedef std::vector<Token> Tokens;

namespace Lexer {

inline Tokens Tokenize(std::string expr) {
    throw std::exception(); // Not implemented yet, so the first test fails.
}

} // namespace Lexer
} // namespace Interpreter





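The test compiles and fails, so the next step is to make it pass with the smallest possible change. A useful guide here is Robert Martin's The Transformation Priority Premise (TPP). Refactorings change the structure of code without changing its behavior; transformations are their counterpart and change behavior, moving the code from the specific toward the generic. The transformations are ordered by priority, and the premise is that preferring the simpler ones leads to simpler code and makes it less likely to get stuck. From here on, each step names the TPP transformation it uses.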



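The transformations, from highest priority to lowest:

  1. ({} → nil) Replace the absence of code with code that uses nil.
  2. (nil → constant) Replace nil with a constant.
  3. (constant → constant+) Replace a simple constant with a more complex one (a one-letter string with a longer string, for example).
  4. (constant → scalar) Replace a constant with a variable or an argument.
  5. (statement → statements) Add more unconditional statements (break, continue, return and the like).
  6. (unconditional → if) Split the execution path with a conditional.
  7. (scalar → array) Replace a variable/argument with an array.
  8. (array → container) Replace an array with a more complex container.
  9. (statement → recursion) Replace a statement with recursion.
  10. (if → while) Replace a conditional with a loop.
  11. (expression → function) Replace an expression with a function.
  12. (variable → assignment) Change the value of a variable.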
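To make the list less abstract, here is a minimal sketch of three of these transformations applied to a toy function that is not part of the interpreter. The names SumStep1, SumStep2, SumStep3 and CheckSteps are hypothetical and exist only for this illustration; each version is what the tests known at that moment would force.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Step 1, (nil -> constant): the only test so far is "empty input gives 0".
inline int SumStep1(const std::vector<int> &) {
    return 0;
}

// Step 2, (unconditional -> if): a test with one element forces a branch.
inline int SumStep2(const std::vector<int> &values) {
    if(values.empty()) {
        return 0;
    }
    return values[0];
}

// Step 3, (if -> while): a test with several elements turns the branch into a loop.
inline int SumStep3(const std::vector<int> &values) {
    int total = 0;
    std::size_t index = 0;
    while(index < values.size()) {
        total += values[index];
        ++index;
    }
    return total;
}

// Each step passes the tests written up to that point.
inline void CheckSteps() {
    assert(SumStep1({}) == 0);
    assert(SumStep2({ 7 }) == 7);
    assert(SumStep3({ 1, 2, 3 }) == 6);
}
```

The lexer below grows the same way: a constant return value first, then a conditional, then a loop.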

Applying the first transformation, ({} → nil), is enough to make the test pass.

inline Tokens Tokenize(std::string expr) {
    return {};
}





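The test passes, and there is nothing to refactor yet. Before writing the next test we change std::string to std::wstring, so that expressions are handled as Unicode. The next test checks a single operator.

TEST_METHOD(Should_tokenize_single_plus_operator) {
    Tokens tokens = Lexer::Tokenize(L"+");
    AssertRange::AreEqual({ Operator::Plus }, tokens);
}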





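AssertRange is a small helper namespace of our own. Its AreEqual function compares an expected initializer list with an actual range: first the sizes, then the elements one by one, reporting the position of the first mismatch.

AssertRange

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    // Compare the sizes first, so a length mismatch is reported explicitly.
    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");

    // Then compare element by element and report the failing position.
    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange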
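For readers without Visual Studio, the same comparison can be sketched without CppUnitTestFramework. This is only an illustration under that assumption: FirstMismatch and CheckFirstMismatch are hypothetical names, and instead of failing a test the function returns the position a failure message would report.

```cpp
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of the first differing element,
// the length of the shorter range if only the sizes differ,
// or -1 if both ranges are equal.
template<class T, class ActualRange>
long FirstMismatch(std::initializer_list<T> expect, const ActualRange &actual) {
    auto expectIter = std::begin(expect);
    auto actualIter = std::begin(actual);
    long position = 0;
    for(; expectIter != std::end(expect) && actualIter != std::end(actual);
        ++expectIter, ++actualIter, ++position) {
        if(!(*expectIter == *actualIter)) {
            return position;
        }
    }
    // One range ended before the other: report where the shorter one stopped.
    if(expectIter != std::end(expect) || actualIter != std::end(actual)) {
        return position;
    }
    return -1;
}

inline void CheckFirstMismatch() {
    const std::vector<int> actual{ 1, 2, 3 };
    assert(FirstMismatch({ 1, 2, 3 }, actual) == -1); // equal ranges
    assert(FirstMismatch({ 1, 5, 3 }, actual) == 1);  // element differs
    assert(FirstMismatch({ 1, 2 }, actual) == 2);     // sizes differ
}
```

Like AssertRange::AreEqual, it only requires operator== on the element type, so it works with any range that supports std::begin and std::end.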







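Operator is an enumeration. Its underlying type is wchar_t and the value of each enumerator is the operator's character, so a character can be cast to an operator directly. For now, a token is simply an operator.

enum class Operator : wchar_t {
    Plus = L'+',
};
typedef Operator Token;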





To report a failed comparison, Assert has to turn the compared values into text, so it needs a ToString function for Token.

std::wstring ToString(const Token &)

inline std::wstring ToString(const Token &token) {
    return { static_cast<wchar_t>(token) };
}







Now the test compiles, but fails. To make it pass, we apply (unconditional → if).

inline Tokens Tokenize(std::wstring expr) {
    if(expr.empty()) {
        return {};
    }
    return { static_cast<Operator>(expr[0]) };
}





The tests pass. The next test covers a single digit.

TEST_METHOD(Should_tokenize_single_digit) {
    Tokens tokens = Lexer::Tokenize(L"1");
    AssertRange::AreEqual({ 1.0 }, tokens);
}





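A token now has to represent either an operator or a number. The alternatives mentioned at this point include a class hierarchy derived from Token (with dynamic_cast to recover the concrete type), std::function, and Boost.Any. The code that follows uses a single Token class that records its type alongside the value.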



The Token class is developed test-first as well, in its own TokenTests class. The first test checks the type of an operator token.

enum class TokenType {
    Operator,
    Number
};

class Token {
public:
    Token(Operator) {}
    TokenType Type() const { return TokenType::Operator; }
};
…
TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }
};





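A ToString function for TokenType is needed as well, so that Assert can print it. The next test checks the type of a number token.

TEST_METHOD(Should_get_type_for_number_token) {
    Token numToken(1.2);
    Assert::AreEqual(TokenType::Number, numToken.Type());
}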





The test fails. We apply (constant → scalar): the type becomes a member that each constructor sets.

class Token {
public:
    Token(Operator) : m_type(TokenType::Operator) {}
    Token(double) : m_type(TokenType::Number) {}
    TokenType Type() const { return m_type; }
private:
    TokenType m_type;
};





Next, a token must give back the operator it was created from.

TEST_METHOD(Should_get_operator_code_from_operator_token) {
    Token token(Operator::Plus);
    Assert::AreEqual<Operator>(Operator::Plus, token);
}





To pass it, the token stores the operator and exposes it through a conversion operator.

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    operator Operator() const { return m_operator; }
    …
    Operator m_operator;
};





The same test for a number.

TEST_METHOD(Should_get_number_value_from_number_token) {
    Token token(1.23);
    Assert::AreEqual<double>(1.23, token);
}





A token holds either an operator or a number, never both, so the two fields can share storage in a union. The conversion operators check the stored type and throw if it does not match.

Token

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            return L"Unknown token."; // Wide literal: the function returns std::wstring.
    }
}
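The following self-contained sketch shows how the checked conversions behave from the caller's side. It repeats the Token class with a one-operator enum so that it compiles on its own; ThrowsWhenReadAsNumber and CheckTokenConversions are hypothetical helpers added only for this demonstration.

```cpp
#include <cassert>
#include <stdexcept>

enum class Operator : wchar_t { Plus = L'+' };
enum class TokenType { Operator, Number };

// Same shape as the Token class above: a type tag plus a union.
class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}
    TokenType Type() const { return m_type; }
    operator Operator() const {
        if(m_type != TokenType::Operator) throw std::logic_error("Should be operator token.");
        return m_operator;
    }
    operator double() const {
        if(m_type != TokenType::Number) throw std::logic_error("Should be number token.");
        return m_number;
    }
private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

// Hypothetical helper: true if reading the token as a number
// is rejected with std::logic_error.
inline bool ThrowsWhenReadAsNumber(const Token &token) {
    try {
        const double value = static_cast<double>(token);
        (void)value; // Only the conversion matters here.
    } catch(const std::logic_error &) {
        return true;
    }
    return false;
}

inline void CheckTokenConversions() {
    assert(ThrowsWhenReadAsNumber(Token(Operator::Plus)));
    assert(!ThrowsWhenReadAsNumber(Token(1.5)));
    assert(static_cast<Operator>(Token(Operator::Plus)) == Operator::Plus);
}
```

Reading the inactive member of the union is never attempted: the type tag is checked first, which is why the exception is preferable to returning a garbage value.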







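With Token in place, we return to the lexer and the single-digit test. The simplest change that makes it pass:

if(expr[0] >= L'0' && expr[0] <= L'9') {
    return { static_cast<double>(expr[0] - L'0') };
}
return { static_cast<Operator>(expr[0]) };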





The tests pass. The next test is a floating-point number.

TEST_METHOD(Should_tokenize_floating_point_number) {
    Tokens tokens = Lexer::Tokenize(L"12.34");
    AssertRange::AreEqual({ 12.34 }, tokens);
}





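Rather than parse the number by hand, we turn to the C standard library: isdigit to detect the start of a number and atof to convert it, or rather their wchar_t counterparts, iswdigit and _wtof (the latter is Microsoft-specific). This is the (expression → function) transformation.

inline Tokens Tokenize(std::wstring expr) {
    const wchar_t *current = expr.c_str();
    if(!*current) return {};
    if(iswdigit(*current)) return { _wtof(current) };
    return { static_cast<Operator>(*current) };
}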





The next test: an operator followed by a number.

TEST_METHOD(Should_tokenize_plus_and_number) {
    Tokens tokens = Lexer::Tokenize(L"+12.34");
    AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
}





First, a small preparatory refactoring: the tokens are collected in a local variable result, which is returned at the end.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    if(!*current) return result;
    if(iswdigit(*current)) {
        result.push_back(_wtof(current));
    } else {
        result.push_back(static_cast<Operator>(*current));
    }
    return result;
}





Now the transformation itself: (if → while). The loop consumes the expression until it reaches the terminating null character.

inline Tokens Tokenize(std::wstring expr) {
    Tokens result;
    const wchar_t *current = expr.c_str();
    while(*current) {
        if(iswdigit(*current)) {
            wchar_t *end = nullptr;
            result.push_back(wcstod(current, &end));
            current = end; // Continue right after the parsed number.
        } else {
            result.push_back(static_cast<Operator>(*current));
            ++current;
        }
    }
    return result;
}





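Note that wcstod has replaced _wtof: besides the value, it reports where the number ended, which is exactly what is needed to advance current past it.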
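The role of the end pointer is easier to see in isolation. The sketch below is not part of the lexer; ParsedNumber, ParseLeadingNumber and CheckParseLeadingNumber are hypothetical names used only to show what wcstod reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Value of the number at the start of a string and how many characters it took.
struct ParsedNumber {
    double value;
    std::size_t length;
};

// Parses the number at the start of `text`, using the end pointer
// that wcstod fills in to measure how much input was consumed.
inline ParsedNumber ParseLeadingNumber(const std::wstring &text) {
    const wchar_t *begin = text.c_str();
    wchar_t *end = nullptr;
    const double value = std::wcstod(begin, &end);
    return { value, static_cast<std::size_t>(end - begin) };
}

inline void CheckParseLeadingNumber() {
    const ParsedNumber parsed = ParseLeadingNumber(L"12.5+3");
    assert(parsed.value == 12.5);
    assert(parsed.length == 4); // "12.5" was consumed, "+3" is left.

    // When nothing can be converted, the end pointer stays at the start.
    assert(ParseLeadingNumber(L"+").length == 0);
}
```

A function like _wtof returns only the value, so the caller would have to scan the digits a second time to find out where to continue.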



All tests pass. The next test checks that spaces are skipped.

TEST_METHOD(Should_skip_spaces) {
    Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
    AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
}





We apply (unconditional → if): a token is produced only for a known operator, and any other character is skipped.

while(*current) {
    if(iswdigit(*current)) {
        wchar_t *end = nullptr;
        result.push_back(wcstod(current, &end));
        current = end;
    } else if(*current == static_cast<wchar_t>(Operator::Plus)) {
        result.push_back(static_cast<Operator>(*current));
        ++current;
    } else {
        ++current; // Not a digit and not an operator: skip it.
    }
}





The tests pass, and it is time to refactor. The function has grown, so its logic moves into a Tokenizer class in the Detail namespace, and Tokenize merely delegates to it.

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

Detail::Tokenizer

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext();
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const { return *m_current == static_cast<wchar_t>(Operator::Plus); }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail







What remains is the rest of the operators and the parentheses. One test with a complex expression covers them all.

TEST_METHOD(Should_tokenize_complex_expression) {
    Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
    AssertRange::AreEqual({
        Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
        Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
        Token(5), Token(Operator::RParen)
    }, tokens);
}





Operator gets the missing enumerators.

enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};





The test still fails, because IsOperator in Tokenizer recognizes only the plus sign. We update it to check every operator.

bool IsOperator() const {
    auto all = {
        Operator::Plus, Operator::Minus, Operator::Mul,
        Operator::Div, Operator::LParen, Operator::RParen
    };
    return std::any_of(all.begin(), all.end(),
        [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
}
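As a point of comparison, the same check can be written as a lookup in a string of operator characters. This is only a sketch of an alternative, not the article's implementation; IsOperatorChar is a hypothetical free function.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical alternative to the std::any_of version: look the character
// up in a string that lists every operator character.
inline bool IsOperatorChar(wchar_t c) {
    // wcschr also matches the terminating null character, so exclude it
    // explicitly; otherwise the end of the expression would look like an operator.
    return c != L'\0' && std::wcschr(L"+-*/()", c) != nullptr;
}

inline void CheckIsOperatorChar() {
    assert(IsOperatorChar(L'+'));
    assert(IsOperatorChar(L')'));
    assert(!IsOperatorChar(L'1'));
    assert(!IsOperatorChar(L' '));
    assert(!IsOperatorChar(L'\0'));
}
```

The lookup is shorter, but it repeats the operator characters outside the Operator enumeration, so the two lists have to be kept in sync by hand. The std::any_of version avoids that by naming the enumerators themselves.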





All tests pass. The complete code at this point:

Interpreter.h

#pragma once
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>
#include <wchar.h>
#include <wctype.h>

namespace Interpreter {

// The value of each operator is its character code.
enum class Operator : wchar_t {
    Plus = L'+',
    Minus = L'-',
    Mul = L'*',
    Div = L'/',
    LParen = L'(',
    RParen = L')',
};

inline std::wstring ToString(const Operator &op) {
    return { static_cast<wchar_t>(op) };
}

enum class TokenType {
    Operator,
    Number
};

inline std::wstring ToString(const TokenType &type) {
    switch(type) {
        case TokenType::Operator:
            return L"Operator";
        case TokenType::Number:
            return L"Number";
        default:
            throw std::out_of_range("TokenType");
    }
}

class Token {
public:
    Token(Operator op) : m_type(TokenType::Operator), m_operator(op) {}
    Token(double num) : m_type(TokenType::Number), m_number(num) {}

    TokenType Type() const { return m_type; }

    operator Operator() const {
        if(m_type != TokenType::Operator)
            throw std::logic_error("Should be operator token.");
        return m_operator;
    }

    operator double() const {
        if(m_type != TokenType::Number)
            throw std::logic_error("Should be number token.");
        return m_number;
    }

    // Tokens are equal when both the type and the stored value match.
    friend inline bool operator==(const Token &left, const Token &right) {
        if(left.m_type == right.m_type) {
            switch(left.m_type) {
                case Interpreter::TokenType::Operator:
                    return left.m_operator == right.m_operator;
                case Interpreter::TokenType::Number:
                    return left.m_number == right.m_number;
                default:
                    throw std::out_of_range("TokenType");
            }
        }
        return false;
    }

private:
    TokenType m_type;
    union {
        Operator m_operator;
        double m_number;
    };
};

inline std::wstring ToString(const Token &token) {
    switch(token.Type()) {
        case TokenType::Number:
            return std::to_wstring(static_cast<double>(token));
        case TokenType::Operator:
            return ToString(static_cast<Operator>(token));
        default:
            throw std::out_of_range("TokenType");
    }
}

typedef std::vector<Token> Tokens;

namespace Lexer {

namespace Detail {

class Tokenizer {
public:
    Tokenizer(const std::wstring &expr) : m_current(expr.c_str()) {}

    void Tokenize() {
        while(!EndOfExpression()) {
            if(IsNumber()) {
                ScanNumber();
            } else if(IsOperator()) {
                ScanOperator();
            } else {
                MoveNext(); // Skip anything that is not a number or an operator.
            }
        }
    }

    const Tokens &Result() const { return m_result; }

private:
    bool EndOfExpression() const { return *m_current == L'\0'; }

    bool IsNumber() const { return iswdigit(*m_current) != 0; }

    void ScanNumber() {
        wchar_t *end = nullptr;
        m_result.push_back(wcstod(m_current, &end));
        m_current = end;
    }

    bool IsOperator() const {
        auto all = {
            Operator::Plus, Operator::Minus, Operator::Mul,
            Operator::Div, Operator::LParen, Operator::RParen
        };
        return std::any_of(all.begin(), all.end(),
            [this](Operator o) { return *m_current == static_cast<wchar_t>(o); });
    }

    void ScanOperator() {
        m_result.push_back(static_cast<Operator>(*m_current));
        MoveNext();
    }

    void MoveNext() { ++m_current; }

    const wchar_t *m_current;
    Tokens m_result;
};

} // namespace Detail

inline Tokens Tokenize(const std::wstring &expr) {
    Detail::Tokenizer tokenizer(expr);
    tokenizer.Tokenize();
    return tokenizer.Result();
}

} // namespace Lexer
} // namespace Interpreter







InterpreterTests.cpp

#include "stdafx.h"
#include "CppUnitTest.h"
#include "Interpreter.h"

namespace InterpreterTests {

using namespace Microsoft::VisualStudio::CppUnitTestFramework;
using namespace Interpreter;
using namespace std;

namespace AssertRange {

template<class T, class ActualRange>
static void AreEqual(initializer_list<T> expect, const ActualRange &actual) {
    auto actualIter = begin(actual);
    auto expectIter = begin(expect);

    Assert::AreEqual(distance(expectIter, end(expect)),
                     distance(actualIter, end(actual)),
                     L"Size differs.");

    for(; expectIter != end(expect) && actualIter != end(actual); ++expectIter, ++actualIter) {
        auto message = L"Mismatch in position " + to_wstring(distance(begin(expect), expectIter));
        Assert::AreEqual<T>(*expectIter, *actualIter, message.c_str());
    }
}

} // namespace AssertRange

TEST_CLASS(LexerTests) {
public:
    TEST_METHOD(Should_return_empty_token_list_when_put_empty_expression) {
        Tokens tokens = Lexer::Tokenize(L"");
        Assert::IsTrue(tokens.empty());
    }

    TEST_METHOD(Should_tokenize_single_plus_operator) {
        Tokens tokens = Lexer::Tokenize(L"+");
        AssertRange::AreEqual({ Operator::Plus }, tokens);
    }

    TEST_METHOD(Should_tokenize_single_digit) {
        Tokens tokens = Lexer::Tokenize(L"1");
        AssertRange::AreEqual({ 1.0 }, tokens);
    }

    TEST_METHOD(Should_tokenize_floating_point_number) {
        Tokens tokens = Lexer::Tokenize(L"12.34");
        AssertRange::AreEqual({ 12.34 }, tokens);
    }

    TEST_METHOD(Should_tokenize_plus_and_number) {
        Tokens tokens = Lexer::Tokenize(L"+12.34");
        AssertRange::AreEqual({ Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_skip_spaces) {
        Tokens tokens = Lexer::Tokenize(L" 1 + 12.34 ");
        AssertRange::AreEqual({ Token(1.0), Token(Operator::Plus), Token(12.34) }, tokens);
    }

    TEST_METHOD(Should_tokenize_complex_expression) {
        Tokens tokens = Lexer::Tokenize(L"1+2*3/(4-5)");
        AssertRange::AreEqual({
            Token(1), Token(Operator::Plus), Token(2), Token(Operator::Mul), Token(3),
            Token(Operator::Div), Token(Operator::LParen), Token(4), Token(Operator::Minus),
            Token(5), Token(Operator::RParen)
        }, tokens);
    }
};

TEST_CLASS(TokenTests) {
public:
    TEST_METHOD(Should_get_type_for_operator_token) {
        Token opToken(Operator::Plus);
        Assert::AreEqual(TokenType::Operator, opToken.Type());
    }

    TEST_METHOD(Should_get_type_for_number_token) {
        Token numToken(1.2);
        Assert::AreEqual(TokenType::Number, numToken.Type());
    }

    TEST_METHOD(Should_get_operator_code_from_operator_token) {
        Token token(Operator::Plus);
        Assert::AreEqual<Operator>(Operator::Plus, token);
    }

    TEST_METHOD(Should_get_number_value_from_number_token) {
        Token token(1.23);
        Assert::AreEqual<double>(1.23, token);
    }
};

} // namespace InterpreterTests







The code for this article is available on GitHub.






