From bda05feab14ed67b72b42507020630c87ea676d9 Mon Sep 17 00:00:00 2001
From: Connor Rhodes
Date: Sun, 17 May 2026 18:59:28 -0500
Subject: [PATCH] u

---
 _skill-index.md                 |   7 +++
 pov-doc/assets/verkada-logo.png | Bin 8138 -> 0 bytes
 youtube-transcript/SKILL.md     |  90 ++++++++++++++++++++++++++++++++
 3 files changed, 97 insertions(+)
 delete mode 100644 pov-doc/assets/verkada-logo.png
 create mode 100644 youtube-transcript/SKILL.md

diff --git a/_skill-index.md b/_skill-index.md
index d608762..7a889eb 100644
--- a/_skill-index.md
+++ b/_skill-index.md
@@ -41,6 +41,7 @@ description: Master index of all skills in your robot assistant system. Your ass
 | "add this to my brag sheet," "log this kudos," "save this feedback," "add to brag sheet," "log a win" | **brag-sheet** |
 | "support punt," "forward to support," "send to support," "punt this to support," "hand off to support" | **send-to-support** |
 | "add a contact," "look up a contact," "find someone's number," "update a contact," "delete a contact," "list my contacts," "contacts" | **contacts** |
+| "transcribe this video," "get the subtitles," "what does this video say," "summarize this YouTube video," "YouTube transcript" | **youtube-transcript** |
 
 ---
 
@@ -214,6 +215,12 @@ description: Master index of all skills in your robot assistant system. Your ass
 **File:** `skills/contacts/SKILL.md`
 **Dependencies:** `uv` CLI, Python 3.12+, `contacts` script at `skills/contacts/scripts/contacts`
 
+### YouTube Transcript
+**Purpose:** Download and summarize transcripts from YouTube videos using yt-dlp. Cleans VTT subtitles into readable plain text and summarizes or analyzes content as requested.
+**Triggers:** "transcribe this video," "get the subtitles," "what does this video say," "summarize this YouTube video," "YouTube transcript"
+**File:** `skills/youtube-transcript/SKILL.md`
+**Dependencies:** `yt-dlp` CLI, Python 3
+
 ---
 
 ## Adding New Skills
diff --git a/pov-doc/assets/verkada-logo.png b/pov-doc/assets/verkada-logo.png
deleted file mode 100644
index 570ba79fd2c51454afb0627324945fa0a7612af3..0000000000000000000000000000000000000000
Binary files a/pov-doc/assets/verkada-logo.png and /dev/null differ
diff --git a/youtube-transcript/SKILL.md b/youtube-transcript/SKILL.md
new file mode 100644
index 0000000..5276d84
--- /dev/null
+++ b/youtube-transcript/SKILL.md
@@ -0,0 +1,90 @@
+---
+name: youtube-transcript
+description: Download and summarize transcripts from YouTube videos using yt-dlp. Use this skill whenever the user provides a YouTube URL and wants the transcript, a summary, or to analyze the content of a video. Also trigger when the user says "transcribe this video", "get the subtitles", "what does this video say", or "summarize this YouTube video".
+---
+
+# YouTube Transcript Download & Summarization
+
+## Overview
+
+This skill downloads auto-generated or manual subtitles from YouTube videos using `yt-dlp`, cleans them into readable plain text, and then summarizes or analyzes the content as requested.
+
+## Prerequisites
+
+- `yt-dlp` must be installed (check with `which yt-dlp`)
+- Python 3 is used for cleaning the VTT output
+
+## Step 1: Download the transcript
+
+Use yt-dlp to fetch subtitles without downloading the video:
+
+```bash
+yt-dlp --write-auto-sub --sub-lang en --skip-download --sub-format vtt \
+  -o "/tmp/opencode/transcript" "YOUTUBE_URL"
+```
+
+Flags explained:
+- `--write-auto-sub`: Download auto-generated subtitles (use `--write-sub` instead if you want only manually uploaded subtitles, or pass both flags to accept either kind)
+- `--sub-lang en`: Request English subtitles
+- `--skip-download`: Don't download the video/audio
+- `--sub-format vtt`: Get subtitles in VTT format
+
+Also grab the video title for context:
+
+```bash
+yt-dlp --print title "YOUTUBE_URL"
+```
+
+## Step 2: Clean the VTT to plain text
+
+The raw VTT file contains timestamps, HTML-like tags, and duplicated lines. Clean it with Python:
+
+```python
+import re
+
+with open('/tmp/opencode/transcript.en.vtt', 'r', encoding='utf-8') as f:
+    content = f.read()
+
+# Remove inline tags (e.g. <c> and word-level <00:00:01.000> timing markers)
+content = re.sub(r'<[^>]+>', '', content)
+# Remove cue timing lines ("start --> end" plus any cue settings)
+content = re.sub(r'\d{2}:\d{2}:\d{2}\.\d+ --> \d{2}:\d{2}:\d{2}\.\d+.*', '', content)
+# Remove VTT headers
+content = re.sub(r'WEBVTT.*', '', content)
+content = re.sub(r'Kind:.*', '', content)
+content = re.sub(r'Language:.*', '', content)
+
+# Drop blank lines and consecutive duplicates (VTT repeats text across overlapping cues)
+lines = content.strip().split('\n')
+clean = []
+prev = ''
+for line in lines:
+    line = line.strip()
+    if line and line != prev:
+        clean.append(line)
+        prev = line
+
+text = ' '.join(clean)
+text = re.sub(r'\s+', ' ', text)
+
+with open('/tmp/opencode/transcript_clean.txt', 'w', encoding='utf-8') as f:
+    f.write(text)
+```
+
+This writes the transcript as a single clean paragraph to `/tmp/opencode/transcript_clean.txt`.
+
+## Step 3: Read and summarize
+
+Read the cleaned transcript. For long transcripts, split the text into chunks (~4000 words each) to avoid truncation, then read each chunk in order.
+
+Summarize the content according to the user's request:
+- If they asked for a summary, provide a concise summary organized by topic
+- If they asked about a specific argument or section, find and explain that part
+- If they want the full transcript, present the cleaned text
+
+## Notes
+
+- The subtitle file is named from the `-o` value plus the language code and format extension, e.g. `/tmp/opencode/transcript.en.vtt`
+- If no English subtitles are available, no `.en.vtt` file is produced. Run `yt-dlp --list-subs "YOUTUBE_URL"` to see which languages are available.
+- Auto-generated subtitles can contain errors, especially in proper nouns and technical terms. Mention this if the user needs precision.
+- For very long videos (>1 hour), the transcript may be very large. Consider splitting it into sections and summarizing each before combining the results.
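
The chunking that Step 3 describes could be sketched roughly as follows. This is not part of the patch: `chunk_words` and its `max_words` parameter are hypothetical names chosen for illustration, and counting whitespace-separated words is only an approximation of whatever limit actually causes truncation.

```python
from typing import List


def chunk_words(text: str, max_words: int = 4000) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Hypothetical helper for illustration; words are whitespace-separated
    tokens, so the chunk size is only a rough proxy for any model limit.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    # Step through the word list in strides of max_words and rejoin each slice.
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    # Tiny demonstration with a small chunk size instead of the real transcript.
    sample = "one two three four five six seven"
    for number, chunk in enumerate(chunk_words(sample, max_words=3), start=1):
        print(number, chunk)
```

In practice the input would be the contents of `/tmp/opencode/transcript_clean.txt`, and each chunk would be read or summarized in turn before the partial summaries are combined.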