139. word break by 5ky7 · Pull Request #40 · 5ky7/arai60

5ky7 · 2026-03-20T12:05:35Z

liquo-rice · 2026-03-20T13:51:41Z

memo.md

+    static constexpr int kNotChecked = 0;
+    static constexpr int kNotConstructable = 1;


数字に意味がないなら、enumの方が良いでしょう。２つしか取り得る値がないなら、boolで良さそうです。

enumの存在を存じ上げなかったのでとても勉強になりました．
boolについてはvector<bool>は特殊化が起こるので，意図的に避けました．

liquo-rice · 2026-03-20T13:57:01Z

memo.md

+            return false;
+        }
+
+        for (int prefix_length = substring.size(); 1 <= prefix_length; --prefix_length) {


制約的に20文字より長い単語は辞書に存在しないので、無駄が多いかもしれません。

制約を見落としていました...ありがとうございます．
仰る通り，メンバ変数としてmax_ward_length = 20としておいて，

Suggested change

for (int prefix_length = substring.size(); 1 <= prefix_length; --prefix_length) {

for (int prefix_length = max_ward_length; 1 <= prefix_length; --prefix_length) {

でよさそうですね．

「辞書内の単語は20文字以内」という制約の妥当性について改めて考えてみると

ユーザIDなどであれば登録時に制約がかかるので妥当

自然言語の単語からなる辞書なら20文字以上の単語があり得るので（目的次第では）怪しい

といった感じでしょうか．
この辺まで実装時に見えるようになりたいです．

liquo-rice · 2026-03-20T13:57:42Z

memo.md

+                               const int start_pos,
+                               const std::unordered_set<string>& word_dict,
+                               std::vector<int>& is_constructable_from) {
+        string substring = s.substr(start_pos);


コピーを作成する必要がないなら、string_viewで良さそうです。

huyfififi · 2026-03-21T00:01:52Z

memo.md

+class Solution {
+public:
+    bool wordBreak(string s, vector<string>& wordDict) {
+        vector<bool> is_constructable(s.size() + 1, false); // is_constructable[i] :「sのi文字目」までが構成可能か？(i=0は空文字列として常にtrue)


std::vector の中でも std::vector<bool> だけ特殊化されており、使用する上で色々と注意点があるみたいです。私がこの前軽くGoogleした限り、単純なフラグ配列として使いたいだけなら、std::vector<char> や std::vector<uint8_t> などの方が無難そうでした。

huyfififi · 2026-03-21T00:05:36Z

memo.md

+    }
+
+    bool wordBreakHelper(string s, vector<string>& wordDict, set<string> memo) {
+        if (s.empty() || memo.contains(s)) return true;


ぶら下がり if はバグの原因になり得るので、避けられた方がいいかと思います。

nodchip · 2026-03-21T09:37:22Z

memo.md

+
+　メモ化を試みた際，単にメモを用意するだけではTLEだった．効率よくメモに書き込む必要があった．ここでの効率は，`wordDict`から文字数について貪欲に選ぶと言うことに対応していた．つまり，できるだけ早く`is_constructable_from`に書き込みをするべきで，それにはできるだけ早く末尾に近い（再帰呼び出しが起こらないことに対応）`start_pos`を引数にもつ`CheckConstructability()`の呼び出しに到達する必要がある．そのためには`start_pos`が大きくなる順に，つまり文字数が大きい順に`wordDict`から取り出すべきと言うことになる．
+
+　このTLEは事前に見積もれるようになりたい．最悪のケースでは


こちらのコメントをご参照ください。
Yuto729/LeetCode_arai60#16 (comment)

ありがとうございます！
こちらに従いますと，今回はmemo.mdに記述した算出法で$6 \times 10^5$ステップになるので，おおよそ6ms程度の時間になりそうですが，~10ms程度はTLEにならない印象なので，なぜ通過しないのか困惑しております．
定数倍の問題かと思い，以下のようにstring_viewを利用したり参照渡しに修正したりしてみましたがTLEになることに変わりはありませんでした．
（ステップ数の計算が間違っているかもしれないので考え直してみます．）

#include <vector> #include <string_view> class Solution { public: bool wordBreak(const string s, const vector<string>& wordDict) { std::string_view s_view{s.begin(), s.end()}; return CheckConstructability(s, wordDict); } private: bool CheckConstructability(const string_view s, const vector<string>& wordDict) { if (s.empty()) { return true; } for (const string& target : wordDict) { if (s == target) { return true; } // s != target if (s.size() < target.size()) { continue; } string_view prefix = s.substr(0, target.size()); string_view suffix = s.substr(target.size()); if (prefix != target) { continue; } if (CheckConstructability(suffix, wordDict)) { return true; } } return false; } };

掛け算じゃなくて、指数関数ですね。下の例だと、各ポジションで20回分岐するのを300回近く繰り返すので、雑な見積もりだと20^300ぐらいのオーダーですね。

入力:
- s = 'a' * 299 + 'b'
- wordDict = {"a", "aa", ... , 'a'*20, (あとはaを使わずbを含まない適当な文字列を1980個)}

なるほど，理解しました．確かに20回呼び出しのそれぞれでさらに20回呼び出し，を繰り返す形ですね．
ありがとうございます．

nodchip · 2026-03-21T09:40:49Z

memo.md

+
+class Solution {
+public:
+    bool wordBreak(const string s, const vector<string>& wordDict) {


引数を値型にすると、関数呼び出しのたびに引数のコピーが作られます。複合型で、なおかつ関数内で変更されないものについては、原則 const 参照渡しで渡すことをお勧めいたします。

ただし、 CPU のレジスターサイズ以下のサイズの型を参照渡しにすると、アセンブラーレベルでは間接アドレッシングを使用するコードが生成される場合があり、逆に遅くなる場合があります。これらは値渡しで渡すことをお勧めいたします。ただし、コンパイラーの最適化オプションで間接アドレッシングを使用するコードの生成が回避される場合もあります。

nodchip · 2026-03-21T09:45:19Z

memo.md

+            return true;
+        }
+
+        for (string target : wordDict) {


ranged for 文で受ける際も、内容を変更しないのであれば、 const 参照で受けることをおすすめします。 CPU のレジスターサイズ以下のサイズの型については値型で受けることをおすすめします。

nodchip · 2026-03-21T09:48:48Z

memo.md

+        }
+
+        for (string target : wordDict) {
+            if (s == target) {


for 文の外で

if (std::ranges::contains(wordDict, s)) { return true; }

としたほうがシンプルになると思います。

確かに，この箇所は取り出して分離させた方がロジックとしてはわかりやすいですね．
std::ranges::contains()の存在を知らなかったので勉強になりました．

nodchip · 2026-03-21T09:50:31Z

memo.md

+            }
+            string prefix = s.substr(0, target.size());
+            string suffix = s.substr(target.size());
+            if (prefix != target) {


if (s.starts_with(target)) {

のほうがシンプルだと思います。

こちらも存在を知らないメンバ関数だったので，標準ライブラリなどのクラスを用いるときによく使いそうな挙動を実現しようと思ったらメンバ関数にないか確認する癖をつけたいです．

nodchip · 2026-03-21T10:01:52Z

memo.md

+        return is_constructible(s.size() - 1, s, wordDict);
+    }
+private:
+    vector<int> memo;


こちらのコメントをご参照ください。
#22 (comment)

nodchip · 2026-03-21T10:03:29Z

memo.md

+                    continue;
+                }
+
+                if (wordDict_set.find(s.substr(start, end - start)) != wordDict_set.end()) { // このsubstr取得でO(n)の計算量が生じることに注意


std::string_view::starts_with() を使うと、コピーを作らずに処理できると思いました。

nodchip · 2026-03-21T10:06:21Z

memo.md

+        bool is_end = false;
+        unordered_map<char, TrieNode*> children;
+    };
+    TrieNode* root = new TrieNode();


Trie 木を構築したあと、メモリを解放していないのが気になりました。実動作環境だとメモリリークとなり、メモリを使いつくし、システムを応答不能にする可能性があります。 new でメモリを確保して生ポインターに格納した場合は delete で解放しましょう。 std::unique_ptr や std::shared_ptr を使用したほうがより安全だと思います。

ありがとうございます．
deleteを使う方法は

void DeleteTrie(TrieNode* node) { if (!node) { return; } for (auto& child : node->children) { DeleteTrie(child); } delete node; }

を用意した上で，デストラクタ

~Solution() { DeleteTrie(root); }

を用意すれば良いですね．

また，今回はTrieのChildrenは親を一つしか持たないので，unique_ptrでの実装が適切そうです．そちらで書き直したのが以下のコードになります．

class Solution { private: struct TrieNode { bool is_end = false; unordered_map<char, std::unique_ptr<TrieNode>> children; }; std::unique_ptr<TrieNode> root = std::make_unique<TrieNode>(); public: bool wordBreak(string s, vector<string>& wordDict) { for (string& word : wordDict) { TrieNode* node = root.get(); for (char c : word) { if (!node->children.contains(c)) { node->children[c] = std::make_unique<TrieNode>(); } node = node->children[c].get(); } node->is_end = true; } vector<bool> is_constructable(s.size() + 1, false); is_constructable[0] = true; for (int i = 0; i <= s.size(); ++i) { if (!is_constructable[i]) continue; TrieNode* node = root.get(); for (int j = i; j < s.size(); ++j) { char c = s[j]; if (!node->children.contains(c)) { break; } node = node->children[c].get(); if (node->is_end) { is_constructable[j + 1] = true; } } } return is_constructable[s.size()]; } };

vector<TrieNode>を作って、vectorへのindexをTrieNode内で持つ方がおそらく速いですね。

vectorを作って、vectorへのindexをTrieNode内で持つ方がおそらく速いですね。

unordered_map<char, std::unique_ptr<TrieNode>> children;の代わりに
unordered_map<char, int> children;を持つ，そして外部に
vector<std::unique_ptr<TrieNode>> charsを用意しておいて，
chars[children[c]]でアクセスする，と言う意図でしょうか．

こうするとなぜ早くなるのか，掴みきれていないです...
TrieNodeオブジェクトは何回も生成されるが，
unordered_map<char, std::unique_ptr<TrieNode>>の初期化は重く，
unordered_map<char, int>の初期化は軽いので，何回も生成されるのがオーバーヘッドとして積み重なる．
一方vector<std::unique_ptr<TrieNode>> charsは一度用意しておけば何回も作り直す必要はないのでオーバーヘッドが少なく済む，という意図であっていますでしょうか．

vector<TrieNode> nodes;
を用意して

struct TrieNode { bool is_last; unordered_map<char, int> char_to_child_node_index; };

とするイメージです。

vectorだと、ヒープからメモリ確保する回数が減り、また、TrieNodeがメモリ上で連続に並び、キャッシュに乗りやすくなるので、速くなりそうです。TrieNodeは、vectorで管理されるので、unique_ptrは不要になります。

unordered_mapの初期化のコストはどちらでもほぼ同じかと思います。

vector<TrieNode>を作って、vectorへのindexをTrieNode内で持つ方がおそらく速いですね。

補足：vector<TrieNode>をバッククオートで囲っていなかったため、テンプレートの部分が消えて表示されていたので修正いたしました。

nodchip · 2026-03-21T10:09:01Z

memo.md

+            TrieNode* node = root;
+            for (int j = i; j < s.size(); ++j) {
+                char c = s[j];
+                if (node->children.find(c) == node->children.end()) break;


find() と operator[] で、内部的にノードを見つける操作が 2 回走るのが気になりました。

auto it = node->children.find(c); if (it == node->children.end()) break; node = it->second;

とすると、ノードを見つける操作が 1 回で済みます。

ただ、コードがややうるさくなってしまうため、速度を重視しない場合は現状のままでもよいと思います。

nodchip · 2026-03-21T10:10:14Z

memo.md

+        while (!frontier.empty()) {
+            int start = frontier.top();
+            frontier.pop();
+            for(string word : wordDict) {


for のあとにスペースを空けることをお勧めいたします。

参考までにスタイルガイドへのリンクを共有いたします。

https://google.github.io/styleguide/cppguide.html#Horizontal_Whitespace

if (b) { // Space after the keyword in conditions and loops.

なお、このスタイルガイドは“唯一の正解”というわけではなく、数あるガイドラインの一つに過ぎません。チームによって重視される書き方や慣習も異なります。そのため、ご自身の中に基準を持ちつつも、最終的にはチームの一般的な書き方に合わせることをお勧めします。

5ky7 added 3 commits March 10, 2026 23:32

Step1 途中

2040046

Step1

51c5dab

Step2 正規表現に関して

1ebb59d

liquo-rice reviewed Mar 20, 2026

View reviewed changes

huyfififi reviewed Mar 21, 2026

View reviewed changes

nodchip reviewed Mar 21, 2026

View reviewed changes

		static constexpr int kNotChecked = 0;
		static constexpr int kNotConstructable = 1;

	for (int prefix_length = substring.size(); 1 <= prefix_length; --prefix_length) {
	for (int prefix_length = max_ward_length; 1 <= prefix_length; --prefix_length) {


		メモ化を試みた際，単にメモを用意するだけではTLEだった．効率よくメモに書き込む必要があった．ここでの効率は，`wordDict`から文字数について貪欲に選ぶと言うことに対応していた．つまり，できるだけ早く`is_constructable_from`に書き込みをするべきで，それにはできるだけ早く末尾に近い（再帰呼び出しが起こらないことに対応）`start_pos`を引数にもつ`CheckConstructability()`の呼び出しに到達する必要がある．そのためには`start_pos`が大きくなる順に，つまり文字数が大きい順に`wordDict`から取り出すべきと言うことになる．

		このTLEは事前に見積もれるようになりたい．最悪のケースでは

Conversation

5ky7 commented Mar 20, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liquo-rice Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

5ky7 Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liquo-rice Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liquo-rice Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

liquo-rice Mar 25, 2026 •

edited

Loading

5ky7 Mar 26, 2026 •

edited

Loading

liquo-rice Mar 25, 2026 •

edited

Loading

liquo-rice Mar 26, 2026 •

edited

Loading