forked from projectpages/project-pages
-
Notifications
You must be signed in to change notification settings - Fork 0
/
使用aboboo提取原版电影音频和字幕文件制作anki暗记卡片.html
157 lines (152 loc) · 6 KB
/
使用aboboo提取原版电影音频和字幕文件制作anki暗记卡片.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
<!DOCTYPE html>
<html lang="zh">
<head>
<!-- 2022-10-24 Mon 15:11 -->
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>使用aboboo提取原版电影音频和字幕文件制作anki暗记卡片</title>
<meta name="author" content="yefeiyu" />
<meta name="generator" content="Org Mode" />
<link rel="icon" href="/favicon.ico">
<meta name="theme-color" content="#ffffff">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="/style/toc.css"/>
<link rel="stylesheet" href="/style/tufte.css"/>
<link rel="stylesheet" href="/style/main.css"/>
<link rel="alternate" type="application/rss+xml" href="https://gongzhitaao.org/orgcss/org.css"/>
<link rel="stylesheet" type="text/css" href="https://gongzhitaao.org/orgcss/org.css"/>
<link rel="stylesheet" href="/style/copy-pre.css"/>
<link rel="stylesheet" href="/style/clipboard.css">
<link rel="stylesheet" href="/style/custom.css">
<script src="/js/copy-pre.js"></script>
<script src="/js/clipboard.js"></script>
<script src="/js/custom.js"></script>
<script src="https://cdn.jsdelivr.net/npm/clipboard@1/dist/clipboard.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.11/clipboard.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.11/clipboard.js" integrity="sha512-ePtegHW811NTnZd0Er1UxtBb8dizKEdSzANYy/UhxM40FC2yCWwb1CQrj03BPbrs6XdUkcHuyVn9Xq9q0Lm34g==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script src="http://cdnjs.cloudflare.com/ajax/libs/clipboard.js/1.4.0/clipboard.min.js"></script>
<link rel="stylesheet" href="/style/clip2.css"/>
<script src="/js/clip2.js"></script>
</head>
<body>
<div id="preamble" class="status">
<!-- Google Tag Manager (noscript) -->
<noscript><iframe src="https://www.googletagmanager.com/ns.html?id=GTM-NJRFJGX"
height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript>
<!-- End Google Tag Manager (noscript) -->
<header>
<a accesskey="h" href="/index.html"> HOME </a> |
<a href="#" id="edit-in-github">EDIT</a> |
<a href="https://yefeiyu.github.io/index.xml">RSS</a> |
<a accesskey="H" href="/about.html">ABOUT</a> |
<a href="https://github.com/yefeiyu">GITHUB</a>
</header>
</div>
<div id="content" class="content">
<header>
<h1 class="title">使用aboboo提取原版电影音频和字幕文件制作anki暗记卡片</h1>
</header>
<div id="outline-container-orge2f1806" class="outline-2">
<h2 id="orge2f1806"><span class="section-number-2">1.</span> </h2>
<div class="outline-text-2" id="text-1">
<p>
将电影文件和字幕文件改成同名,放到同一文件夹内。使用aboboo本地-打开,则aboboo在切分语音时自动按字幕时间戳切分。
</p>
</div>
</div>
<div id="outline-container-org6fa30f5" class="outline-2">
<h2 id="org6fa30f5"><span class="section-number-2">2.</span> </h2>
<div class="outline-text-2" id="text-2">
<p>
处理字幕。有时候需要原、译文对调;繁体乱码改简体;删除字幕后重新载入等操作。
</p>
</div>
</div>
<div id="outline-container-org5712d7f" class="outline-2">
<h2 id="org5712d7f"><span class="section-number-2">3.</span> </h2>
<div class="outline-text-2" id="text-3">
<p>
句子选择。打开后反选,然后在第二项中修改为”参照文本大于4个词的句子“,确定。
</p>
</div>
</div>
<div id="outline-container-orgf7aa28c" class="outline-2">
<h2 id="orgf7aa28c"><span class="section-number-2">4.</span> </h2>
<div class="outline-text-2" id="text-4">
<p>
导出选中句子到音频文件,字幕格式为utf-8,勾选自动剪除句子首尾静音,每一句一个句子。
</p>
</div>
</div>
<div id="outline-container-org40acf37" class="outline-2">
<h2 id="org40acf37"><span class="section-number-2">5.</span> </h2>
<div class="outline-text-2" id="text-5">
<p>
使用python,生成anki可以导入的create.csv文件。create.py文件代码如下:
</p>
<pre class="code"><code>#-*-coding:utf-8-*-
"""
My first python program
"""
import re
import os
import sys
import csv
#from sys import argv
import glob
import time
output = "Tobeimport.csv"
def Main():
curpath = os.getcwd()
lrcfiles = curpath + "//*.lrc"
flist = glob.glob(lrcfiles)
writer = csv.writer(file(output, 'wb+'))
pattern1 = re.compile(r'00]([\s\S]*) \t')
pattern2 = re.compile(r'\t([\s\S]*)')
i = 0
for f in flist:
i = i + 1
fh = file(f,'r')
content = fh.read()
english = pattern1.findall(content)
chinese = pattern2.findall(content)
print content
print english
print chinese
mp3file = f[:-3] + "mp3"
newname = str(time.time()) + str(i) + ".mp3"
print mp3file
print newname
os.rename(mp3file, newname)
question = "[sound:" + newname +"]"
if not english:
answer =content[10:]
else:
answer = english[0] + " " + question
writer.writerow([question, answer])
fh.close()
del writer
if __name__ == "__main__":
Main()
</code></pre>
</div>
</div>
<div id="outline-container-orgfd43175" class="outline-2">
<h2 id="orgfd43175"><span class="section-number-2">6.</span> </h2>
<div class="outline-text-2" id="text-6">
<p>
将媒体文件导入anki文件夹内。linux是anki安装后在主文件夹内生成documents文件夹;windows是安装后在user/pc-user-id/AppData/Roaming/Anki2/User 1/collection.media中。
</p>
</div>
</div>
</div>
<div id="postamble" class="status">
<footer>
<p>Author: yefeiyu <a aria-label="Follow @yefeiyu on GitHub" data-count-aria-label="# followers on GitHub" data-count-api="/users/jcouyang#followers" data-count-href="/jcouyang/followers" href="https://github.com/yefeiyu class="github-button">Follow @yefeiyu</a></p>
<p>Modified: 2022-10-22 Sat 16:50</p>
<p>Generated by: <a href="https://www.gnu.org/software/emacs/">Emacs</a> 28.1 (<a href="https://orgmode.org">Org</a> mode 9.5.3) × <a href="https://github.com/jcouyang/orgpress">OrgPress</a></p>
</footer>
</div>
</body>
</html>