Python: Traveling Salesman Problem

A quick note that this article has been updated: I wrote a PDF explaining the theory and implementation of bit DP. inarizuuuushi.hatenablog.com
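As a rough illustration of the bit DP technique mentioned above, here is a minimal sketch of the standard Held-Karp formulation of the traveling salesman problem. It is not taken from the linked PDF; the function name `solve_tsp` and the sample distance matrix are assumptions made for this example.

```python
def solve_tsp(dist):
    """Length of the shortest tour that starts and ends at city 0.

    dist[i][j] is the cost of travelling from city i to city j.
    Runs in O(2^n * n^2) time, so it is only practical for small n.
    """
    n = len(dist)
    inf = float("inf")
    # dp[visited][last]: cheapest path that starts at city 0, visits exactly
    # the cities whose bits are set in `visited`, and ends at city `last`.
    dp = [[inf] * n for _ in range(1 << n)]
    dp[1][0] = 0  # only city 0 has been visited
    for visited in range(1 << n):
        if not visited & 1:
            continue  # every valid state contains the start city
        for last in range(n):
            cost = dp[visited][last]
            if cost == inf:
                continue
            for nxt in range(n):
                if visited >> nxt & 1:
                    continue  # nxt has already been visited
                new_visited = visited | (1 << nxt)
                candidate = cost + dist[last][nxt]
                if candidate < dp[new_visited][nxt]:
                    dp[new_visited][nxt] = candidate
    full = (1 << n) - 1
    # Close the tour by returning to city 0.
    return min(dp[full][last] + dist[last][0] for last in range(n))


# Hypothetical 4-city instance (asymmetric distances).
dist = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]
print(solve_tsp(dist))  # 21, the tour 0 -> 2 -> 3 -> 1 -> 0
```

The set of visited cities is packed into the bits of an integer, which is what lets a plain list serve as the DP table.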

2022-07-14

MLBench: Distributed Machine Learning Benchmark

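A benchmark for distributed machine learning frameworks. Only a limited set of algorithms is used. The hardware (GPU/CPU memory), the degree of parallelism (number of workers), and the network bandwidth are all variable, so the benchmark can be used to evaluate how these affect performance. website documentation github Running it…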

2022-07-12

ML Commons

deep-learning

ML Commons is the organization that maintains MLPerf, a benchmark suite for machine learning applications. website: github: As of 2022/06/14, the benchmark categories are as follows. Training, Training: HPC, Inference: Datacenter, Inference: Edge, Inference: Mobile, Inference…

2022-07-06

Using the MPI backend for distributed processing in PyTorch

python pytorch distributed-processing MPI distributed-deep-learning

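torch.distributed provides APIs for distributed processing, such as point-to-point communication and collective communication. This makes it possible to customize fine-grained behavior. As of PyTorch 1.13, the available communication backends are MPI, GLOO, and NCCL. What each backend supports…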

#Pytorch #MPI #DistributedDataParallel #TORCH_DISTRIBUTED_DEBUG

2022-07-04

About the Transformer

I put together a summary of the Transformer. speakerdeck.com

2022-06-30

Survey: Automatic Graph Partitioning for Very Large-scale Deep Learning

paper-survey distributed-deep-learning

Tanaka, Masahiro, et al. "Automatic graph partitioning for very large-scale deep learning." 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 2021. @inproceedings{tanaka2021automatic, title={Automatic gra…

2022-06-28

Survey: Supporting Very Large Models using Automatic Dataflow Graph Partitioning

paper-survey distributed-deep-learning

Wang, Minjie, Chien-chin Huang, and Jinyang Li. "Supporting very large models using automatic dataflow graph partitioning." Proceedings of the Fourteenth EuroSys Conference 2019. 2019. @inproceedings{wang2019supporting, title={Supporting v…

2022-06-24

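Building a translation system with a Transformer from scratch; part5 multi layer block

Transformer-from-scratch

In this post we turn the Encoder and Decoder into multi-block stacks. With this, the basic structure of the Transformer is implemented. In the Encoder, Attention → FeedForward forms one block, and several of these blocks are stacked. Since the input and output dimensions of a block are the same,…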

2022-06-22

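Building a translation system with a Transformer from scratch; part4 FeedForward & Residual Connection

Transformer-from-scratch

In this post we introduce the FeedForward layer, the residual connection, and the normalization layer. The Transformer is built from an attention mechanism and a FeedForward mechanism, and a residual connection is applied around each of them. Implementation: FeedForward. FeedForward refers to a cycle-free neu…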
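To make the combination of FeedForward, residual connection, and normalization concrete, here is a minimal pure-Python sketch for a single token vector. It is not the article's implementation: the post-norm arrangement `LayerNorm(x + sublayer(x))`, the ReLU activation, the omission of learnable scale and shift, and all function names are assumptions made for illustration.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (almost) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def feed_forward(x, w1, b1, w2, b2):
    """Two linear layers with a ReLU in between.

    w1 holds one weight row per hidden unit, w2 one row per output unit.
    """
    hidden = [
        max(0.0, sum(xi * wi for xi, wi in zip(x, row)) + b)
        for row, b in zip(w1, b1)
    ]
    return [sum(h * wi for h, wi in zip(hidden, row)) + b for row, b in zip(w2, b2)]


def residual_block(x, sublayer):
    """Add the sublayer output back onto its input, then normalize."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])


# Hypothetical weights: model width 4, hidden width 2.
x = [1.0, 2.0, 3.0, 4.0]
w1 = [[0.1] * 4, [-0.1] * 4]
b1 = [0.0, 0.0]
w2 = [[1.0, 0.0] for _ in range(4)]
b2 = [0.0] * 4

out = residual_block(x, lambda v: feed_forward(v, w1, b1, w2, b2))
print(out)
```

Because the sublayer returns a vector of the same width as its input, the addition in `residual_block` is well defined, and the same wrapper can be reused around an attention sublayer.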

2022-06-20

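Building a translation system with a Transformer from scratch; part3 Multi-head Attention

Transformer-from-scratch

In this post we implement Multi-head Attention. Multi-head Attention: split Q, K, and V, run scaled dot-product attention on each piece, and aggregate (concat) the results. This gives better accuracy (I am not sure why, though; it also has the effect of extracting multiple contexts). …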
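The split, attend, and concat steps described above can be sketched in pure Python as follows. This is only an illustration under simplifying assumptions: matrices are nested lists, the learned projection matrices of a full Transformer are left out, and the function names are hypothetical rather than taken from the article.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(q, k, v):
    """q: (n_q, d), k: (n_k, d), v: (n_k, d_v), all as nested lists."""
    d = len(q[0])
    out = []
    for q_row in q:
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d) for k_row in k
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * v_row[j] for w, v_row in zip(weights, v)) for j in range(len(v[0]))]
        )
    return out


def split_heads(x, n_heads):
    """Cut the feature axis of x (n, d_model) into n_heads equal chunks."""
    d_head = len(x[0]) // n_heads
    return [[row[h * d_head:(h + 1) * d_head] for row in x] for h in range(n_heads)]


def multi_head_attention(q, k, v, n_heads):
    """Run attention per head, then concatenate along the feature axis.

    q, k, and v are assumed to share the same feature width.
    """
    if len(q[0]) % n_heads != 0:
        raise ValueError("feature width must be divisible by n_heads")
    heads = [
        scaled_dot_product_attention(qh, kh, vh)
        for qh, kh, vh in zip(
            split_heads(q, n_heads), split_heads(k, n_heads), split_heads(v, n_heads)
        )
    ]
    return [[value for head in heads for value in head[i]] for i in range(len(q))]


# Hypothetical input: 2 tokens, model width 4, used as Q, K, and V (self-attention).
x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
out = multi_head_attention(x, x, x, n_heads=2)
print(out)
```

Each head sees only its own slice of the features, so the output keeps the input shape while the heads compute their attention weights independently.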

2022-06-16

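Building a translation system with a Transformer from scratch; part2 Building a prototype (a simple Transformer)

Transformer-from-scratch

In this post we build a simplified Transformer as a prototype of the translation model. We build an English → Japanese translation model with the architecture shown in the figure below. The figure above shows the model being trained to infer "犬はかわいい。" by feeding "dog is cute" to the encoder and "犬はかわいい。" to the decoder…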

2022-06-14

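Building a translation system with a Transformer from scratch; part1 Preprocessing & the overall training framework

Transformer-from-scratch

In this post we go over the natural language preprocessing that is essential for building a translation model. Glossary. Corpus: a corpus is "in linguistics, a large, structured collection of natural language text compiled for use in natural language processing research" 「コーパス」 (2022-06-2…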

2022-06-10

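Building a translation system with a Transformer from scratch

Transformer-from-scratch

Deep learning architectures that use (part of) the Transformer structure have become the de facto standard in natural language processing. To understand what lies at its core, I aim to build a Japanese-English translation model with a Transformer, writing as much of it myself as possible. Over the following several posts…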

2022-06-08

Survey: Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

paper-survey distributed-deep-learning

@article{shoeybi2019megatron, title={Megatron-lm: Training multi-billion parameter language models using model parallelism}, author={Shoeybi, Mohammad and Patwary, Mostofa and Puri, Raul and LeGresley, Patrick and Casper, Jared and Catanza…

#Megatron-LM

2022-06-06

python, debugging warnings

python

When debugging a warning in Python, it is very handy if pdb, the source code debugger, can catch it. For example, running the code below: import numpy as np x = np.ones((2, 2), dtype=np.float16) x[0, 0] = 1e4 y = x ** 2 $ python tmp.py /Users/tat…
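One common way to let a debugger stop on a warning is to promote warnings to exceptions with the standard `warnings` module. The excerpt above is truncated, so this may or may not be the approach the article takes; `risky_square` is a hypothetical stand-in for the NumPy overflow, and `run_strict` is a helper name invented for this sketch.

```python
import warnings


def risky_square(x):
    # Hypothetical stand-in for the float16 overflow in the excerpt:
    # it emits a RuntimeWarning instead of failing.
    if abs(x) > 255:
        warnings.warn("overflow encountered in square", RuntimeWarning)
    return x * x


def run_strict(func, *args):
    """Call func with every warning promoted to an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        return func(*args)


caught = None
try:
    run_strict(risky_square, 1e4)
except RuntimeWarning as exc:
    # At this point a debugger could inspect the failing frame,
    # for example with pdb.post_mortem(exc.__traceback__).
    caught = str(exc)

print(caught)
```

From the command line, `python -W error -m pdb tmp.py` applies the same filter to a whole script, so pdb enters post-mortem mode where the warning is raised. Note that this promotes every warning, including ones raised while importing libraries.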

2022-06-03

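Topics: microservices, ultradiscretization

Microservices, ultradiscretization. Microservices: building software out of small, independent components (functions) is called microservices. Because the functionality is split up, development can proceed independently at a fine granularity, and compute resources can be allocated dynamically according to load…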

2022-06-02

Sourcetrail, python error

python

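Sourcetrail is handy for reading code written by others, such as open source projects! (The quality is hard to believe for free software.) When I installed it again after a long while and tried it, I hit an error. Error details. Version where the error occurred: 2021.4.19 Release 2021.4.19 · CoatiSoftware…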

2022-06-01

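Sample code for using the small_parallel_enja dataset, python, pytorch

python natural-language-processing pytorch

Tanaka Corpus. Sample code in python, pytorch. Assuming use with pytorch, this post shows examples of creating a torchtext.vocab.Vocab (vocabulary) and creating a Dataset and DataLoader.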

2022-05-31

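Sample code for using the kftt dataset, python, pytorch

python natural-language-processing pytorch

About kftt. Usage examples with python, pytorch. Assuming use with pytorch, this post shows examples of creating a torchtext.vocab.Vocab (vocabulary) and creating a Dataset and DataLoader.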

2022-05-30

pulp: speeding up constraint addition

python pulp

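Environment: >>> import pulp >>> pulp.__version__ '2.5.1' Main text: when adding a large number of constraints, for example when adding ; the code below takes 28.81 s to run. (f(i) is some function that returns a real number.) prob = pulp.LpProblem() # create the variables x = [pulp.LpVariable…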

2022-05-27

Sample code for Vocab, build_vocab_from_iterator, and Vectors in torchtext.vocab

python natural-language-processing

Sample code for each class in the vocab module of torchtext, torch's library for natural language processing. Contents covered: torchtext.vocab.vocab torchtext.vocab.build_vocab_from_iterator torchtext.vocab.GloVe torchtext.vocab.FastText torchtext.vocab.C…

2022-05-26

Survey: ZeRO-Offload: Democratizing Billion-Scale Model Training

paper-survey memory-saving deep-learning

@inproceedings{ren2021zero, title={$\{$ZeRO-Offload$\}$: Democratizing $\{$Billion-Scale$\}$ Model Training}, author={Ren, Jie and Rajbhandari, Samyam and Aminabadi, Reza Yazdani and Ruwase, Olatunji and Yang, Shuangyan and Zhang, Minjia a…

2022-05-25

Survey: ZeRO: Memory Optimizations Toward Training Trillion Parameter Models

paper-survey

@inproceedings{rajbhandari2020zero, title={Zero: Memory optimizations toward training trillion parameter models}, author={Rajbhandari, Samyam and Rasley, Jeff and Ruwase, Olatunji and He, Yuxiong}, booktitle={SC20: International Conference…

2022-05-24

Survey: Training Deep Nets with Sublinear Memory Cost

paper-survey memory-saving deep-learning

Chen, Tianqi, et al. "Training deep nets with sublinear memory cost." arXiv preprint arXiv:1604.06174 (2016). @article{chen2016training, title={Training deep nets with sublinear memory cost}, author={Chen, Tianqi and Xu, Bing and Zhang, Ch…

2022-05-23

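Distributed Deep Learning (Distributed DL): a summary

Deep learning models have become an essential tool for tasks such as natural language processing. In recent years, one of the mainstream directions has been to pursue higher accuracy by scaling up Transformer-based models through repeating the same architectural pattern, and by increasing the amount of training data…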

2022-05-20

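Survey: GPUメモリ管理の実行時最適化による大規模深層学習の高速化 (Accelerating large-scale deep learning through runtime optimization of GPU memory management) (2018)

paper-survey deep-learning memory-saving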

@article{伊藤祐貴2018gpu, title={GPU メモリ管理の実行時最適化による大規模深層学習の高速化}, author={伊藤祐貴 and 今井晴基 and 根岸康 and 河内谷清久仁 and 松宮遼 and 遠藤敏夫 and others}, journal={研究報告ハイパフォーマンスコンピューティン…

2022-05-17

Survey: Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM

distributed-deep-learning paper-survey

https://dl.acm.org/doi/10.1145/3458817.3476209 paper: @inproceedings{10.1145/3458817.3476209, author = {Narayanan, Deepak and Shoeybi, Mohammad and Casper, Jared and LeGresley, Patrick and Patwary, Mostofa and Korthikanti, Vijay and Vainbr…

2022-05-16

Changing dot and line attributes in seaborn.regplot

python

Sample code for seaborn.regplot (https://seaborn.pydata.org/generated/seaborn.regplot.html): import seaborn as sns; sns.set_theme(color_codes=True) tips = sns.load_dataset("tips") ax = sns.regplot( x="total_bill", y="tip", data=tips, ) Here…

2022-05-13

Survey: Mesh-tensorflow: Deep learning for supercomputers

distributed-deep-learning paper-survey

@article{shazeer2018mesh, title={Mesh-tensorflow: Deep learning for supercomputers}, author={Shazeer, Noam and Cheng, Youlong and Parmar, Niki and Tran, Dustin and Vaswani, Ashish and Koanantakool, Penporn and Hawkins, Peter and Lee, Hyouk…

2022-05-12

Survey: PipeDream: Generalized Pipeline Parallelism for DNN Training

distributed-deep-learning paper-survey

https://dl.acm.org/doi/abs/10.1145/3341301.3359646?casa_token=L-sKQKrRoE4AAAAA%3AYKo9NPdnPyG6IouMN5jfTHTCYFAGORDxen32GKAteeSG-ROhqx_OX-hVOfuyHiVBXLLJH0RPujhFPEk @inproceedings{narayanan2019pipedream, title={PipeDream: generalized pipeline …

Sabrou-mal サブロウ丸

Mostly programming and mathematics

Articles for the year starting 2022-01-01
