CBC ソルバログの可視化 - Sabrou-mal サブロウ丸

最終的に描写するもの

暫定解値と, 下限値の時間推移
最適値

CBCログ

問題サイズなど

Welcome to the CBC MILP Solver 
Version: 2.10.3 
Build Date: Dec 15 2019 

command line - xxxx
At line 2 NAME          MODEL
At line 3 ROWS
At line 2426 COLUMNS
At line 1454117 RHS
At line 1456539 BOUNDS
At line 1776810 ENDATA
Problem MODEL has 2421 rows, 320270 columns and 641750 elements

(中略)

暫定解値と下限値

Cbc0010I After 12600 nodes, 3099 on tree, 1885 best solution, best possible 1874 (1252.44 seconds)
Cbc0010I After 12700 nodes, 3138 on tree, 1885 best solution, best possible 1874 (1259.93 seconds)
Cbc0010I After 12800 nodes, 3201 on tree, 1885 best solution, best possible 1874 (1265.68 seconds)

(中略)

最終結果

Result - Optimal solution found

Objective value:                1883.00000000
Enumerated nodes:               95478
Total iterations:               1723840
Time (CPU seconds):             5497.38
Time (Wallclock seconds):       5521.17

Option for printingOptions changed from normal to all
Total time (CPU seconds):       5498.64   (Wallclock seconds):       5522.54%

読み込みコード

re.compile & re.match

正規表現モジュール re を用いて読み込みます. 今回のようにデータの形式(1行1行のデータ列)に統一性があり, その1行の中から特定の文字列や数字を取り出したいときにこれが便利です.

re.compileで, 読み込む行のパターンをコンパイルしたのち .match関数により読み込んだ行がこのパターンと一致するかをチェックします. もしパターンが一致していれば .groupdict()によりre.compileで指定したkeyからなる辞書で値を取り出せます. もしパターンと一致しなければNoneが返ってきます.

例

import re

Cbc0010I = re.compile(
    "Cbc0010I After (?P<node>.*) nodes, "
    +"(?P<num_subproblem>.*) on tree, "
    +"(?P<incumbent>.*) best solution, "
    +"best possible (?P<lower_bound>.*)"
    +"\((?P<time>.*) seconds\)"
)

row = 'Cbc0010I After 94500 nodes, 277 on tree, 1883 best solution, best possible 1882 (5489.11 seconds)'
m = Cbc0010I.match(row)
print(m.groupdict())
>>> {'node': '94500', 'num_subproblem': ' 277', 'incumbent': '1883', 'lower_bound': '1882', 'time': '5489.11'}

row = 'hogehoge'
m = Cbc0010I.match(row)
print(m)
>>> None

実際のコード

import re

Cbc0010I = re.compile(
    "Cbc0010I After (?P<node>.*) nodes, "
    +"(?P<num_subproblem>.*) on tree, "
    +"(?P<incumbent>.*) best solution, "
    +"best possible (?P<lower_bound>.*) "
    +"\((?P<time>.*) seconds\)"
)
Objective = re.compile(
    "Objective value: (?P<objective_value>.*)"
)

logs = list()

for row in open('cbc_log.txt', 'r'):
    # read incumbent, lower bound log
    m = Cbc0010I.match(row)
    if m is not None:
        logs.append(m.groupdict())
        continue
        
    # read objective value log
    m = Objective.match(row)
    if m is not None:
        objective_value = m.groupdict()['objective_value']

ちなみに, python3.8から導入された"セイウチ演算子"を用いると何行か節約できます.
(:= ← 横から見るとセイウチ

for row in open('cbc_log.txt', 'r'):
    # read incumbent, lower bound log
    if (m := Cbc0010I.match(row)) is not None:  # ここが変わった
        logs.append(m.groupdict())
    # read objective value log
    elif (m := Objective.match(row)) is not None:  # ここが変わった
        objective_value = m.groupdict()['objective_value']

描写

pandas + matplotlib が楽です.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(logs).astype(float)     # cast from str to float
objective_value = float(objective_value)  # cast from str to float

fig, ax = plt.subplots()

# plot incumbent and lower bound
df.plot(x='time', y='incumbent', ax=ax)
df.plot(x='time', y='lower_bound', ax=ax)

# plot objective value
x_min, x_max = 0, float(df.tail(1).time)
ax.hlines(objective_value, x_min, x_max, color='red', linestyles='--', label='optimal')

# some settings
ax.set_xlabel('Time [s]')
ax.set_ylabel('Objective value')
ax.grid('--')
ax.legend()