#!/usr/bin/env python3
# Postponed annotation evaluation keeps `int | None` valid on Python < 3.10.
from __future__ import annotations

import argparse
import json
from pathlib import Path

# Reuse the single-file generator
import sys
sys.path.append(str(Path(__file__).parent))
from make_mel_images import save_mel_image  # type: ignore


SUPPORTED_KEYS = [
    ("noisy", "noisy_mel"),
    ("miipher", "miipher_mel"),
    ("restored", "restored_mel"),
    ("sidon", "sidon_mel"),
    ("gt", "gt_mel"),
]


def process_manifest(
    manifest_path: Path,
    out_dir: Path,
    inplace: bool = True,
    sr: int = 22050,
    n_fft: int = 1024,
    hop_length: int = 256,
    n_mels: int = 80,
    fmin: int = 0,
    fmax: int | None = 8000,
) -> Path:
    """Render a mel image for every audio file listed in the manifest.

    Each sample's audio paths (see SUPPORTED_KEYS) are rendered to PNGs in
    ``out_dir`` and recorded under the matching ``*_mel`` key. Returns the
    path of the manifest that was written.
    """
    data = json.loads(manifest_path.read_text(encoding="utf-8"))
    samples = data.get("samples", [])
    out_dir.mkdir(parents=True, exist_ok=True)

    for s in samples:
        # Fall back to a slug of the title; `or "sample"` also covers a null title.
        sample_id = s.get("id") or (s.get("title") or "sample").lower().replace(" ", "_")
        for key, mel_key in SUPPORTED_KEYS:
            audio_path = s.get(key)
            if not audio_path:
                continue
            audio_p = Path(audio_path)
            if not audio_p.exists():
                print(f"[warn] missing audio: {audio_path}")
                continue
            out_path = out_dir / f"{sample_id}_{key}.png"
            print(f"[mel] {audio_path} -> {out_path}")
            save_mel_image(
                audio_path=audio_p,
                out_path=out_path,
                sr=sr,
                n_fft=n_fft,
                hop_length=hop_length,
                n_mels=n_mels,
                fmin=fmin,
                fmax=fmax,
            )
            s[mel_key] = str(out_path)

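    # Overwrite the manifest in place, or write a sibling "<name>.with_mels.json".
    manifest_out = manifest_path if inplace else manifest_path.with_suffix(".with_mels.json")
    # Write UTF-8 explicitly, since ensure_ascii=False may emit non-ASCII characters.
    manifest_out.write_text(json.dumps(data, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")
    return manifest_out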


def main():
    p = argparse.ArgumentParser(description="Generate mel images for all samples in a manifest and update it.")
    p.add_argument("manifest", type=Path, nargs="?", default=Path("assets/samples.json"))
    p.add_argument("--out_dir", type=Path, default=Path("assets/mels"))
    p.add_argument("--no_inplace", action="store_true", help="Write to <manifest>.with_mels.json instead of overwriting")
    p.add_argument("--sr", type=int, default=22050)
    p.add_argument("--n_fft", type=int, default=1024)
    p.add_argument("--hop", dest="hop_length", type=int, default=256)
    p.add_argument("--mels", dest="n_mels", type=int, default=80)
    p.add_argument("--fmin", type=int, default=0)
    p.add_argument("--fmax", type=int, default=8000)
    args = p.parse_args()

    output_manifest = process_manifest(
        manifest_path=args.manifest,
        out_dir=args.out_dir,
        inplace=not args.no_inplace,
        sr=args.sr,
        n_fft=args.n_fft,
        hop_length=args.hop_length,
        n_mels=args.n_mels,
        fmin=args.fmin,
        fmax=args.fmax,
    )
    print(f"updated manifest: {output_manifest}")


if __name__ == "__main__":
    main()