Josh's drivers #14


Merged 108 commits on Jan 22, 2024
Commits
794e42d
update the list of prompts
Dando18 Dec 1, 2023
ea0a335
add checkpointing
Dando18 Dec 2, 2023
067fcaf
add throughput
Dando18 Dec 3, 2023
0093aa2
update generation
Dando18 Dec 3, 2023
adbde05
add stack analysis script
Dando18 Dec 8, 2023
508f1e9
update throughput scripts
Dando18 Dec 8, 2023
70e0f58
Update gitignore
jhdavis8 Dec 17, 2023
3bd5888
add sum of prefix sum gpu driver
jhdavis8 Dec 17, 2023
d45b402
Fixes to scan 28 gpu driver, add test outputs, subprocess import
jhdavis8 Dec 18, 2023
b347497
add implementations for drivers that have been tested
Dando18 Dec 19, 2023
fd6ca30
update analysis scripts
Dando18 Dec 19, 2023
5541a02
updated set of currently working drivers
Dando18 Dec 20, 2023
64e175a
Fix indent in scan-28 gpu.cu
jhdavis8 Dec 22, 2023
c23b930
Change scan 28 gpu to use copy macros instead of memcpy symbol call
jhdavis8 Dec 22, 2023
793d8da
fix kernel call in gpu.cu scan 28, update test output for same
jhdavis8 Dec 22, 2023
c7e0c8c
Merge branch 'update-prompts' into josh-drivers
jhdavis8 Dec 22, 2023
0a35e46
Add scan 27
jhdavis8 Jan 3, 2024
a417cee
Update scan benchmark names to match current ids
jhdavis8 Jan 3, 2024
064afa7
Some fixes for scan 31
jhdavis8 Jan 3, 2024
e20c5fa
Add scan 30 drivers
jhdavis8 Jan 3, 2024
a55cdc8
Add MPI support for cpu.cc in scan 30, 31, 32
jhdavis8 Jan 5, 2024
7e9cf6e
Update gpu.cu for scan 30-32
jhdavis8 Jan 5, 2024
56ba05d
Small updates to scan 30-32 kokkos
jhdavis8 Jan 5, 2024
8514bfc
Complete scan 33
jhdavis8 Jan 5, 2024
e3dfe00
Add scan 34, small changes to template generator
jhdavis8 Jan 8, 2024
fbb1750
Update test outputs with right numbers for scan prompts, add script t…
jhdavis8 Jan 8, 2024
ff0282d
Various updates to scan drivers fixing minor bugs
jhdavis8 Jan 8, 2024
dc987fd
Make create driver template and run all executable scripts
jhdavis8 Jan 9, 2024
cd0a722
update template, add reduce 25 drivers
jhdavis8 Jan 9, 2024
b452c53
Add reduce 27 drivers
jhdavis8 Jan 10, 2024
7c9137d
Add reduce 26 drivers
jhdavis8 Jan 10, 2024
32ebedf
Add reduce 28
jhdavis8 Jan 10, 2024
1e136fc
Add reduce 29
jhdavis8 Jan 12, 2024
5f30824
Merge branch 'main' into josh-drivers
jhdavis8 Jan 17, 2024
669b6af
Add stencil 50 drivers
jhdavis8 Jan 17, 2024
eb79d47
Add stencil 51 driver
jhdavis8 Jan 17, 2024
7150ca5
Add stencil 53 driver
jhdavis8 Jan 17, 2024
dd5f617
update formatting and problem sizes for scan and reduce
Dando18 Jan 17, 2024
0ffb59a
Add stencil 52 drivers
jhdavis8 Jan 17, 2024
18cec05
Merge branch 'josh-drivers' of github.com:pssg-int/llms-for-hpc into …
jhdavis8 Jan 17, 2024
23998a1
Add stencil 54 drivers
jhdavis8 Jan 17, 2024
348e34d
bug fixes for scan
Dando18 Jan 17, 2024
080d37d
reduce bug fixes
Dando18 Jan 17, 2024
18f63ac
fix bugs in stencil drivers and set problem sizes
Dando18 Jan 17, 2024
344b453
Merge branch 'main' into josh-drivers
jhdavis8 Jan 17, 2024
f20100d
some minor updates after prompt changes
Dando18 Jan 17, 2024
3bba3a2
newest updates to prompts
Dando18 Jan 17, 2024
53db441
update prompts json
Dando18 Jan 17, 2024
3980af2
update generate scripts
Dando18 Jan 18, 2024
bda31f7
add run scripts
Dando18 Jan 18, 2024
e54b88c
add updated model outputs
Dando18 Jan 18, 2024
b22bd1d
update runs scripts
Dando18 Jan 18, 2024
e6400bc
Merge branch 'josh-drivers' into update-outputs-with-new-prompts
Dando18 Jan 18, 2024
65dae9e
add openai outputs
Dando18 Jan 18, 2024
9079bf4
update generation script
Dando18 Jan 18, 2024
7c75f1f
update generate
Dando18 Jan 18, 2024
c81af08
update how scripts are run
Dando18 Jan 18, 2024
5633007
outputs
Dando18 Jan 18, 2024
8076ab9
update generation
Dando18 Jan 18, 2024
ac257d6
add more outputs
Dando18 Jan 19, 2024
770ad8f
update gpt-4 outputs
Dando18 Jan 19, 2024
8ca2278
update search benchmarks
Dando18 Jan 19, 2024
2d2bbd6
update analysis scripts
Dando18 Jan 19, 2024
b5853fd
update model outputs
Dando18 Jan 19, 2024
99c90c2
update computed results
Dando18 Jan 19, 2024
107b584
update model collection
Dando18 Jan 19, 2024
5a262b3
update gpt-4 results
Dando18 Jan 19, 2024
9556370
Add geometry 10 drivers
jhdavis8 Jan 20, 2024
0cd64a2
Geometry 10 fixes for struct decl
jhdavis8 Jan 20, 2024
32765b1
Merge branch 'update-outputs-with-new-prompts' into josh-drivers
jhdavis8 Jan 20, 2024
e50ec3d
Add missing points setup in geometry 10 gpu
jhdavis8 Jan 20, 2024
4340160
Add log-runs option
jhdavis8 Jan 20, 2024
f4cce9b
Geometry 10 formatting
jhdavis8 Jan 20, 2024
d72656a
Add geometry 11 drivers
jhdavis8 Jan 20, 2024
145d21e
Add input validation for geo 11 cpu for testing
jhdavis8 Jan 20, 2024
f336235
Try circle generation for geometry 11
jhdavis8 Jan 20, 2024
123e8fd
Remove restricted validation data generation for geo 11
jhdavis8 Jan 20, 2024
9b45c3d
Modify baseline for geo 11 to provide own distance lambda
jhdavis8 Jan 20, 2024
ac9fead
update gpt 3 and 4 outputs
Dando18 Jan 20, 2024
1b28e21
update metric defaults
Dando18 Jan 21, 2024
4771142
update some driver utility scripts
Dando18 Jan 21, 2024
c8e552c
update result data
Dando18 Jan 21, 2024
e12d9a2
update all.json files
Dando18 Jan 21, 2024
e66934c
update non-openai outputs
Dando18 Jan 21, 2024
4f7c941
Correct position of baseline include in geo 11 kokkos
jhdavis8 Jan 21, 2024
a3dc7b2
Small format/convenience changes for debugging tools, adjust problem …
jhdavis8 Jan 21, 2024
d0e953a
Add storage of return value in geo 11 gpu best
jhdavis8 Jan 21, 2024
aaa990c
Add geo 12 drivers
jhdavis8 Jan 21, 2024
5ef71a7
Add geo 13 drivers
jhdavis8 Jan 21, 2024
bdc5d3d
Geometry 14 drivers
jhdavis8 Jan 21, 2024
9f20132
Adjust floating point error bound geo 13 cpu
jhdavis8 Jan 21, 2024
18b79fa
update geo problem sizes
jhdavis8 Jan 21, 2024
7000999
update hip results
Dando18 Jan 21, 2024
19783e9
update some scripts to handle hip better
Dando18 Jan 21, 2024
250035f
update 59
Dando18 Jan 21, 2024
82161f8
Add cfloat to utilities, needed for DBL_MAX
jhdavis8 Jan 21, 2024
3bc5da2
Merge branch 'update-outputs-with-new-prompts' into josh-drivers
jhdavis8 Jan 21, 2024
25e30b5
update geo 13 and 14 prompts distance function name conflict
jhdavis8 Jan 21, 2024
22b750c
Rename distance function in cuda and hip outputs for geo 13,14
jhdavis8 Jan 21, 2024
bf2e3f6
Update output prompts with distance fn rename
jhdavis8 Jan 21, 2024
45b1db3
add results changes from main
Dando18 Jan 21, 2024
64e830a
Merge branch 'main' into josh-drivers
Dando18 Jan 21, 2024
ad5c844
update for geometry runs
Dando18 Jan 21, 2024
e4bc081
update run scripts
Dando18 Jan 21, 2024
c0e73e7
update hip geometry results
Dando18 Jan 22, 2024
139fe5a
update recorded results with geometry problems
Dando18 Jan 22, 2024
9fe3fce
update driver job scripts
Dando18 Jan 22, 2024
a2d6315
update analysis scripts
Dando18 Jan 22, 2024
37 changes: 32 additions & 5 deletions analysis/all-metrics.sh
@@ -1,13 +1,40 @@
#!/bin/sh


# main metrics
python metrics.py ../results/a8724ee8/codellama-7b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-7B --output ../results/a8724ee8/codellama-7b-hf_prompted_temp0.2/metrics.csv
python metrics.py ../results/a8724ee8/codellama-13b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-13B --output ../results/a8724ee8/codellama-13b-hf_prompted_temp0.2/metrics.csv
python metrics.py ../results/a8724ee8/codellama-34b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-34B --output ../results/a8724ee8/codellama-34b-hf_prompted_temp0.2/metrics.csv

python metrics.py ../results/a8724ee8/starcoderbase_prompted_temp0.2/results.csv --model-name StarCoderBase --output ../results/a8724ee8/starcoderbase_prompted_temp0.2/metrics.csv

python metrics.py ../results/a8724ee8/phind-v2_prompted_temp0.2/results.csv --model-name Phind-V2 --output ../results/a8724ee8/phind-v2_prompted_temp0.2/metrics.csv

python metrics.py ../results/a8724ee8/gpt-3.5_temp0.2/results.csv --model-name GPT-3.5 --output ../results/a8724ee8/gpt-3.5_temp0.2/metrics.csv
python metrics.py ../results/a8724ee8/gpt-4_temp0.2/results.csv --model-name GPT-4 --output ../results/a8724ee8/gpt-4_temp0.2/metrics.csv


# mpi scaling metrics
python metrics-scaling.py ../results/a8724ee8/codellama-7b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-7B -k 1 -n 1 2 4 8 16 32 64 128 256 512 --execution-model mpi --output ../results/a8724ee8/codellama-7b-hf_prompted_temp0.2/metrics-scaling-mpi.csv
python metrics-scaling.py ../results/a8724ee8/codellama-13b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-13B -k 1 -n 1 2 4 8 16 32 64 128 256 512 --execution-model mpi --output ../results/a8724ee8/codellama-13b-hf_prompted_temp0.2/metrics-scaling-mpi.csv
python metrics-scaling.py ../results/a8724ee8/codellama-34b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-34B -k 1 -n 1 2 4 8 16 32 64 128 256 512 --execution-model mpi --output ../results/a8724ee8/codellama-34b-hf_prompted_temp0.2/metrics-scaling-mpi.csv
python metrics-scaling.py ../results/a8724ee8/starcoderbase_prompted_temp0.2/results.csv --model-name StarCoderBase -k 1 -n 1 2 4 8 16 32 64 128 256 512 --execution-model mpi --output ../results/a8724ee8/starcoderbase_prompted_temp0.2/metrics-scaling-mpi.csv
python metrics-scaling.py ../results/a8724ee8/phind-v2_prompted_temp0.2/results.csv --model-name Phind-V2 -k 1 -n 1 2 4 8 16 32 64 128 256 512 --execution-model mpi --output ../results/a8724ee8/phind-v2_prompted_temp0.2/metrics-scaling-mpi.csv
python metrics-scaling.py ../results/a8724ee8/gpt-3.5_temp0.2/results.csv --model-name GPT-3.5 -k 1 -n 1 2 4 8 16 32 64 128 256 512 --execution-model mpi --output ../results/a8724ee8/gpt-3.5_temp0.2/metrics-scaling-mpi.csv
python metrics-scaling.py ../results/a8724ee8/gpt-4_temp0.2/results.csv --model-name GPT-4 -k 1 -n 1 2 4 8 16 32 64 128 256 512 --execution-model mpi --output ../results/a8724ee8/gpt-4_temp0.2/metrics-scaling-mpi.csv


# omp scaling metrics
python metrics-scaling.py ../results/a8724ee8/codellama-7b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-7B -k 1 -n 1 2 4 8 16 32 64 --execution-model omp --output ../results/a8724ee8/codellama-7b-hf_prompted_temp0.2/metrics-scaling-omp.csv
python metrics-scaling.py ../results/a8724ee8/codellama-13b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-13B -k 1 -n 1 2 4 8 16 32 64 --execution-model omp --output ../results/a8724ee8/codellama-13b-hf_prompted_temp0.2/metrics-scaling-omp.csv
python metrics-scaling.py ../results/a8724ee8/codellama-34b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-34B -k 1 -n 1 2 4 8 16 32 64 --execution-model omp --output ../results/a8724ee8/codellama-34b-hf_prompted_temp0.2/metrics-scaling-omp.csv
python metrics-scaling.py ../results/a8724ee8/starcoderbase_prompted_temp0.2/results.csv --model-name StarCoderBase -k 1 -n 1 2 4 8 16 32 64 --execution-model omp --output ../results/a8724ee8/starcoderbase_prompted_temp0.2/metrics-scaling-omp.csv
python metrics-scaling.py ../results/a8724ee8/phind-v2_prompted_temp0.2/results.csv --model-name Phind-V2 -k 1 -n 1 2 4 8 16 32 64 --execution-model omp --output ../results/a8724ee8/phind-v2_prompted_temp0.2/metrics-scaling-omp.csv
python metrics-scaling.py ../results/a8724ee8/gpt-3.5_temp0.2/results.csv --model-name GPT-3.5 -k 1 -n 1 2 4 8 16 32 64 --execution-model omp --output ../results/a8724ee8/gpt-3.5_temp0.2/metrics-scaling-omp.csv
python metrics-scaling.py ../results/a8724ee8/gpt-4_temp0.2/results.csv --model-name GPT-4 -k 1 -n 1 2 4 8 16 32 64 --execution-model omp --output ../results/a8724ee8/gpt-4_temp0.2/metrics-scaling-omp.csv


# kokkos scaling metrics
python metrics-scaling.py ../results/a8724ee8/codellama-7b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-7B -k 1 -n 1 2 4 8 16 32 --execution-model kokkos --output ../results/a8724ee8/codellama-7b-hf_prompted_temp0.2/metrics-scaling-kokkos.csv
python metrics-scaling.py ../results/a8724ee8/codellama-13b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-13B -k 1 -n 1 2 4 8 16 32 --execution-model kokkos --output ../results/a8724ee8/codellama-13b-hf_prompted_temp0.2/metrics-scaling-kokkos.csv
python metrics-scaling.py ../results/a8724ee8/codellama-34b-hf_prompted_temp0.2/results.csv --model-name CodeLlama-34B -k 1 -n 1 2 4 8 16 32 --execution-model kokkos --output ../results/a8724ee8/codellama-34b-hf_prompted_temp0.2/metrics-scaling-kokkos.csv
python metrics-scaling.py ../results/a8724ee8/starcoderbase_prompted_temp0.2/results.csv --model-name StarCoderBase -k 1 -n 1 2 4 8 16 32 --execution-model kokkos --output ../results/a8724ee8/starcoderbase_prompted_temp0.2/metrics-scaling-kokkos.csv
python metrics-scaling.py ../results/a8724ee8/phind-v2_prompted_temp0.2/results.csv --model-name Phind-V2 -k 1 -n 1 2 4 8 16 32 --execution-model kokkos --output ../results/a8724ee8/phind-v2_prompted_temp0.2/metrics-scaling-kokkos.csv
python metrics-scaling.py ../results/a8724ee8/gpt-3.5_temp0.2/results.csv --model-name GPT-3.5 -k 1 -n 1 2 4 8 16 32 --execution-model kokkos --output ../results/a8724ee8/gpt-3.5_temp0.2/metrics-scaling-kokkos.csv
python metrics-scaling.py ../results/a8724ee8/gpt-4_temp0.2/results.csv --model-name GPT-4 -k 1 -n 1 2 4 8 16 32 --execution-model kokkos --output ../results/a8724ee8/gpt-4_temp0.2/metrics-scaling-kokkos.csv
207 changes: 207 additions & 0 deletions analysis/metrics-scaling.py
@@ -0,0 +1,207 @@
""" Compute the metrics over the data for various resource counts.
"""
# std imports
import argparse
import json
from math import comb
from typing import Union

# tpl imports
import numpy as np
import pandas as pd


def get_args():
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("input_csv", type=str, help="Input CSV file containing the test cases.")
    parser.add_argument("-k", "--k", type=int, default=1, help="K value for speedup@k and efficiency@k")
    parser.add_argument("-n", "--n", type=int, nargs='+', default=[1,2,4,8,16,32,64,128,256,512], help="Number of resources for speedup@k and efficiency@k")
    parser.add_argument("--execution-model", choices=['mpi', 'mpi+omp', 'omp', 'kokkos'], default='mpi', help="Execution model to use for speedup@k and efficiency@k")
    parser.add_argument("-o", "--output", type=str, help="Output csv file containing the results.")
    parser.add_argument("--problem-sizes", type=str, default='../drivers/problem-sizes.json', help="Json with problem sizes. Used for calculating GPU efficiency.")
    parser.add_argument("--model-name", type=str, help="Add model name column with this value")
    return parser.parse_args()

def nCr(n: int, r: int) -> int:
    # note: returns 1, not the mathematical 0, when n < r
    if n < r:
        return 1
    return comb(n, r)

def _speedupk(runtimes: Union[pd.Series, np.ndarray], baseline_runtime: float, k: int, n: int) -> float:
    """ Compute the speedup@k metric """
    # create a copy of the runtimes
    if isinstance(runtimes, pd.Series):
        runtimes = runtimes.values.copy()
    else:
        runtimes = runtimes.copy()

    # sort the runtimes
    runtimes.sort()

    # compute expected value
    sum = 0.0
    num_samples = runtimes.shape[0]
    for j in range(1, num_samples+1):
        num = nCr(j-1, k-1) * baseline_runtime
        den = nCr(num_samples, k) * max(runtimes[j-1], 1e-8)
        sum += num / den
    return pd.Series({f"speedup_{n}@{k}": sum})

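# Note: with the default k == 1 (the value used in analysis/all-metrics.sh),
# nCr(j-1, 0) == 1 and nCr(N, 1) == N, so the weighted sum in _speedupk
# reduces to the mean of baseline_runtime / runtime over the N valid samples
# at resource count n.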
def speedupk(df: pd.DataFrame, k: int, n: int) -> pd.DataFrame:
    """ Compute the speedup@k metric """
    df = df.copy()

    # get all runs where is_valid is true
    df = df[df["is_valid"] == True]

    # choose processor count
    df = df[df["n"] == n]
    df = df.copy()

    # use min best_sequential_runtime
    df["best_sequential_runtime"] = df.groupby(["name", "parallelism_model", "output_idx"])["best_sequential_runtime"].transform("min")

    # group by name, parallelism_model, and problem_type and call _speedupk
    df = df.groupby(["name", "parallelism_model", "problem_type"]).apply(
        lambda row: _speedupk(row["runtime"], np.min(row["best_sequential_runtime"]), k, n)
    ).reset_index()

    # compute the mean speedup@k
    df = df.groupby(["parallelism_model", "problem_type"]).agg({f"speedup_{n}@{k}": "mean"})

    return df

def _efficiencyk(runtimes: Union[pd.Series, np.ndarray], baseline_runtime: float, k: int, n_resources: Union[pd.Series, np.ndarray]) -> float:
    """ Compute the efficiency@k metric """
    # create a copy of the runtimes
    if isinstance(runtimes, pd.Series):
        runtimes = runtimes.values.copy()
    else:
        runtimes = runtimes.copy()

    if isinstance(n_resources, pd.Series):
        n_resources = n_resources.values.copy()
    else:
        n_resources = n_resources.copy()

    # sort the runtimes
    runtimes.sort()

    # make sure n_resources is all the same value and get that value
    assert np.all(n_resources == n_resources[0])
    n = int(n_resources[0])

    # compute expected value
    sum = 0.0
    num_samples = runtimes.shape[0]
    for j in range(1, num_samples+1):
        num = nCr(j-1, k-1) * baseline_runtime
        den = nCr(num_samples, k) * max(runtimes[j-1], 1e-8) * n_resources[j-1]
        sum += num / den
    return pd.Series({f"efficiency_{n}@{k}": sum})

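# Similarly, for k == 1 the weighted sum in _efficiencyk reduces to the mean of
# baseline_runtime / (runtime * n) over the N valid samples, i.e. the mean
# speedup divided by the resource count n.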
def efficiencyk(df: pd.DataFrame, k: int, n: int) -> pd.DataFrame:
    """ Compute the efficiency@k metric """
    df = df.copy()

    # get all runs where is_valid is true
    df = df[df["is_valid"] == True]

    # choose processor count
    df = df[df["n"] == n]
    df = df.copy()

    # use min best_sequential_runtime
    df["best_sequential_runtime"] = df.groupby(["name", "parallelism_model", "output_idx"])["best_sequential_runtime"].transform("min")

    # group by name, parallelism_model, and problem_type and call _efficiencyk
    df = df.groupby(["name", "parallelism_model", "problem_type"]).apply(
        lambda row: _efficiencyk(row["runtime"], np.min(row["best_sequential_runtime"]), k, row["n"])
    ).reset_index()

    # compute the mean efficiency@k
    df = df.groupby(["parallelism_model", "problem_type"]).agg({f"efficiency_{n}@{k}": "mean"})

    return df

def parse_problem_size(problem_size: str) -> int:
    """ problem size is of format '(1<<n)' """
    num = problem_size.split("<<")[1][:-1]
    return 2 ** int(num)

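# Example: parse_problem_size("(1<<20)") == 2 ** 20 == 1048576.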
def main():
    args = get_args()

    # read in input
    df = pd.read_csv(args.input_csv)

    # read in problem sizes
    with open(args.problem_sizes, "r") as f:
        problem_sizes = json.load(f)
    for problem in problem_sizes:
        for parallelism_model, problem_size in problem_sizes[problem].items():
            df.loc[(df["name"] == problem) & (df["parallelism_model"] == parallelism_model), "problem_size"] = parse_problem_size(problem_size)

    # remove rows where parallelism_model is kokkos and num_threads is 64
    #df = df[~((df["parallelism_model"] == "kokkos") & (df["num_threads"] == 64))]

    # filter/aggregate
    df["did_run"] = df["did_run"].fillna(False)    # if it didn't build, then this will be nan; overwrite
    df["is_valid"] = df["is_valid"].fillna(False)  # if it didn't build, then this will be nan; overwrite

    if args.execution_model == "mpi":
        df = df[df["parallelism_model"] == "mpi"]
        df["n"] = df["num_procs"]
    elif args.execution_model == "mpi+omp":
        df = df[df["parallelism_model"] == "mpi+omp"]
        df["n"] = df["num_procs"] * df["num_threads"]
    elif args.execution_model == "omp":
        df = df[df["parallelism_model"] == "omp"]
        df["n"] = df["num_threads"]
    elif args.execution_model == "kokkos":
        df = df[df["parallelism_model"] == "kokkos"]
        df["n"] = df["num_threads"]
    else:
        raise NotImplementedError(f"Unsupported execution model {args.execution_model}")

    # get values for each resource count n
    all_results = []
    for n in args.n:
        speedup_values = speedupk(df, args.k, n)
        efficiency_values = efficiencyk(df, args.k, n)
        all_results.extend([speedup_values, efficiency_values])

    # merge all_results; each df has one column and the same index,
    # so build a new df with all the columns and the same index
    merged_df = pd.concat(all_results, axis=1).reset_index()

    # if there were no successful builds or runs, then speedup@k will be nan after merging;
    # replace NaN speedup@k values with 0.0
    for n in args.n:
        merged_df[f"speedup_{n}@{args.k}"] = merged_df[f"speedup_{n}@{args.k}"].fillna(0.0)
        merged_df[f"efficiency_{n}@{args.k}"] = merged_df[f"efficiency_{n}@{args.k}"].fillna(0.0)

    # add model name column
    if args.model_name:
        merged_df.insert(0, "model_name", args.model_name)

    # clean up column names
    column_name_map = {
        "model_name": "model",
        "parallelism_model": "execution model",
        "problem_type": "problem type",
    }
    merged_df = merged_df.rename(columns=column_name_map)

    # write to csv
    if args.output:
        merged_df.to_csv(args.output, index=False)
    else:
        pd.set_option('display.max_columns', merged_df.shape[1]+1)
        pd.set_option('display.max_rows', merged_df.shape[0]+1)
        print(merged_df)


if __name__ == "__main__":
    main()
@@ -0,0 +1,54 @@
#pragma once
#include <vector>
#include <algorithm>

/* Find the set of points that define the smallest convex polygon that contains all the points in the vector points. Store the result in `hull`.
Example:

input: [{0, 3}, {1, 1}, {2, 2}, {4, 4}, {0, 0}, {1, 2}, {3, 1}, {3, 3}]
output: [{0, 3}, {4, 4}, {3, 1}, {0, 0}]
*/
void NO_INLINE correctConvexHull(std::vector<Point> const& points, std::vector<Point> &hull) {
    // The polygon needs to have at least three points
    if (points.size() < 3) {
        hull = points;
        return;
    }

    std::vector<Point> pointsSorted = points;

    std::sort(pointsSorted.begin(), pointsSorted.end(), [](Point const& a, Point const& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });

    auto CrossProduct = [](Point const& a, Point const& b, Point const& c) {
        return (c.x - a.x) * (b.y - a.y) - (c.y - a.y) * (b.x - a.x) > 0;
    };

    std::vector<Point> upperHull;
    std::vector<Point> lowerHull;
    upperHull.push_back(pointsSorted[0]);
    upperHull.push_back(pointsSorted[1]);

    for (size_t i = 2; i < pointsSorted.size(); i++) {
        while (upperHull.size() > 1
               && !CrossProduct(upperHull[upperHull.size() - 2],
                                upperHull[upperHull.size() - 1],
                                pointsSorted[i])) {
            upperHull.pop_back();
        }
        upperHull.push_back(pointsSorted[i]);

        while (lowerHull.size() > 1
               && !CrossProduct(lowerHull[lowerHull.size() - 2],
                                lowerHull[lowerHull.size() - 1],
                                pointsSorted[pointsSorted.size() - i - 1])) {
            lowerHull.pop_back();
        }
        lowerHull.push_back(pointsSorted[pointsSorted.size() - i - 1]);
    }
    upperHull.insert(upperHull.end(), lowerHull.begin(), lowerHull.end());

    hull = upperHull;
    return;
}
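
// Illustrative call of the baseline above (a sketch, not part of the driver;
// it assumes the driver-provided `Point` type is an aggregate with `x` and `y`
// members, as suggested by the comparator and cross-product lambdas):
//
//   std::vector<Point> points = {{0, 3}, {1, 1}, {2, 2}, {4, 4},
//                                {0, 0}, {1, 2}, {3, 1}, {3, 3}};
//   std::vector<Point> hull;
//   correctConvexHull(points, hull);
//   // hull now holds the convex hull vertices, e.g. {0, 3}, {4, 4}, {3, 1}, {0, 0}
//   // as in the example in the header comment.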