VTubing on Linux
Signed-off-by: Xe <me@christine.website>
parent b0e2ed1da8
commit 061d069f72
@@ -0,0 +1,233 @@
---
title: VTubing on Linux
date: 2022-01-15
series: vtuber
tags:
- envtuber
- nixos
- yearofthelinuxdesktop
---

In my [last post](/blog/vtubing-setup-2022-01-13) I went through my VTubing
setup on Windows and all the "generations" of setup that I've done over the last
year. Thanks to the meddling of a certain nerd who is in the chat watching me
write this, I have figured out a way to run this setup on Linux. The ultimate
goal for this phase is to get all this running on my work laptop so I can use it
as a webcam. However, this post is just going to cover the Linux setup bits.

## Differences Between OSes

On Windows, this setup is really straightforward. VSeeFace provides a [webcam
driver](https://www.vseeface.icu/#virtual-camera) that makes the output of the
VSeeFace app pretend to be a USB webcam. Google Meet, OBS and the like can then
pick that up like it was a normal webcam. The overall flow looks like this:

![The webcam connects over USB to VSeeFace, VSeeFace pretends to be a webcam to
OBS and OBS sends video frames to
Twitch.](/static/blog/vtubing-linux/windows.svg)

This doesn't work at all on Linux though. There's no real way to get VSeeFace (a
Windows application built with Unity) to directly pretend to be a webcam at
this moment.

[Pedantically, you can probably get away with doing this using a combination of
PipeWire, Video4Linux or some other incarnation like that, but the main point
here is that VSeeFace is a Windows app and I don't think it's possible to make
Linux-specific calls like that. Feel free to prove me
wrong.](conversation://Mara/hacker)

So, instead we need to have VSeeFace directly output to OBS. This makes the flow
look something like this:

![The webcam connects over USB to OpenSeeFace, OpenSeeFace sends UDP packets to
VSeeFace, OBS grabs the VSeeFace window via XComposite, OBS then sends video
frames to Twitch.](/static/blog/vtubing-linux/nixos.svg)

The main difference is that for some reason VSeeFace on Linux can't capture the
webcam directly. This isn't an issue, however, because
[OpenSeeFace](https://github.com/emilianavt/OpenSeeFace) can capture the webcam
and then send the face capture data directly to VSeeFace instead. Then OBS can
grab VSeeFace via XComposite like normal.

[There may be a way to do this in Wayland, but we haven't figured that out
yet. Please let me know if you figure out a way to get this working in
Wayland.](conversation://Mara/hacker)

One of the major usability differences here is that OpenSeeFace has support for
tracking blinking. However, at the same time my avatar opens its eyes really
slowly when I do blink. There's probably a slider I need to set to make this
less...horrible, but overall it does work! I don't get this on Windows, which is
interesting.

[Kieto, his eyes closed!](conversation://Numa/delet)

## Failed Attempts

One of the biggest stumbling points was the fact that VSeeFace is distributed as
a 64-bit application. Somehow my naive usage of Wine in its default config
caused me to create a 32-bit Wine prefix (it was then I learned that there are
such things as 32-bit and 64-bit prefixes and that they are mutually incompatible),
which made it impossible to launch VSeeFace because Wine would reject it for
being a 64-bit program.
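
If you want to see which kind of prefix you actually ended up with before nuking it, here is a minimal sketch. It assumes Wine records the prefix architecture as an `#arch=win32` or `#arch=win64` line in the prefix's `system.reg`; `prefix_arch` is a name made up for this example.

```shell
#!/usr/bin/env bash
# Print the architecture recorded in a Wine prefix, or "none" if the
# prefix has no system.reg yet.
# Assumption: Wine writes a "#arch=win32" / "#arch=win64" line in system.reg.
prefix_arch() {
  local prefix="${1:-${WINEPREFIX:-$HOME/.wine}}"
  local reg="$prefix/system.reg"
  if [ ! -f "$reg" ]; then
    echo "none"
    return 1
  fi
  # Print whatever follows "#arch=" on the first matching line.
  sed -n 's/^#arch=//p' "$reg" | head -n 1
}
```

Running `prefix_arch ~/.wine` prints the recorded architecture. Wine reads the `WINEARCH` environment variable when it first creates a prefix, so setting `WINEARCH=win64` before the first run is the usual way to ask for a 64-bit one.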

I went through several rounds of nuking `~/.wine`, trying to run it again,
setting various weird environment variables, and setting build overrides. It was a
catastrophe.

Other people have reported that you need to use
[Lutris](https://dumbotaku.com/info/401) to install and use VSeeFace on Linux.
This did not work. This did not work at all. Trying to do it this way on a NixOS
machine was an absolute waste of my time and was demoralizing and frustrating.

[I think it has to do with the fact that Lutris really really really really
wants to have its own special snowflake vendored copies of Wine/Proton and it
will fight you if you try to have your way otherwise.](conversation://Cadey/coffee)

Then I realized that I was doing all this on my work laptop. This laptop is
fairly standard, but also incredibly cursed in its own unique and fun ways. It
shipped with Windows, but also with all the annoying "screw you for wanting to
use Linux" settings turned on. Getting to the point where a NixOS ISO would boot
was an exercise in tedium and randomly flipping settings on and off.

So, at the request of the aforementioned meddler, I tried running VSeeFace on my
gaming tower.

It worked first try.

[AAAAAA](conversation://Cadey/coffee)

## How To Make This Creative Abomination Come To Fruition on NixOS

The easiest part of getting all this working is to download VSeeFace. You just
[download the .zip](https://www.vseeface.icu/) from the main page and extract
it into your Downloads folder.

Then you need to add the following to your `configuration.nix` file:

```nix
# ...
environment.systemPackages = with pkgs; [
  # vseeface
  wine64
  winetricks
];
# ...
```

Rebuild, and this will put Wine (as `wine64`) in your `$PATH`. Now you need
to install the Arial font using winetricks:

```console
$ env WINE=wine64 winetricks arial
```

This will take a moment to create your Wine prefix in `~/.wine` and populate it
with the needed fonts. VSeeFace uses the Arial font everywhere in the UI, so
this is not an optional step.
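
A quick way to confirm the font landed in the prefix is to look for it on disk. This is only a sketch: it assumes winetricks copies the font files into `drive_c/windows/Fonts` inside the prefix with names starting with `arial`, and `has_arial` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeed if an Arial font file exists in the given Wine prefix.
# Assumption: fonts live in drive_c/windows/Fonts and are named arial*.ttf.
has_arial() {
  local prefix="${1:-${WINEPREFIX:-$HOME/.wine}}"
  local hit
  # -iname ignores case; -quit stops at the first match.
  hit="$(find "$prefix/drive_c/windows/Fonts" -iname 'arial*.ttf' -print -quit 2>/dev/null)"
  [ -n "$hit" ]
}
```

Something like `has_arial ~/.wine || echo "run winetricks arial first"` then gives a hint before VSeeFace starts with a blank UI.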

Now, clone OpenSeeFace to somewhere:

```console
$ git clone https://github.com/emilianavt/OpenSeeFace ~/tmp/OpenSeeFace
```

Then copy this `shell.nix` file into the root of the git repo:

```nix
{ pkgs ? import <nixpkgs> { } }:
(pkgs.buildFHSUserEnv {
  name = "pipzone";
  targetPkgs = pkgs:
    (with pkgs; [
      python39
      python39Packages.pip
      python39Packages.virtualenv
      libGL
      libGLU
      glib
    ]);
  runScript = "bash";
}).env
```

Then run `nix-shell` to activate an environment that will pretend to be a normal
Linux system and paste in these commands to set up the Python environment:

```shell
python -m venv .venv
source .venv/bin/activate
pip3 install onnxruntime opencv-python pillow numpy
```

This will install the dependencies into a Python venv.

[We can't really use a normal Nix packaging flow here because <a
href="https://github.com/jonringer/nixpkgs/commit/bc2b132f98b48220fa5ec148aa2ba170aeb9a891">onnxruntime
was removed from nixpkgs</a>. This is okay though, we can hack around
this!](conversation://Mara/hacker)

Then you can run OpenSeeFace:

```console
$ python facetracker.py -c 0 -W 1280 -H 720 --discard-after 0 --scan-every 0 --no-3d-adapt 1 --max-feature-updates 900
```
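
If your camera is not device 0 or you want a different resolution, it can help to build that command from a few variables. This is a minimal sketch, assuming `-c` selects the capture device and `-W`/`-H` set the capture size. `CAMERA`, `WIDTH`, `HEIGHT` and `tracker_cmd` are names invented for this example; the remaining flags are copied unchanged from the command above.

```shell
#!/usr/bin/env bash
# Print the facetracker.py command line for the chosen camera and size.
# CAMERA, WIDTH and HEIGHT are hypothetical variables for this sketch.
tracker_cmd() {
  local cmd=(
    python facetracker.py
    -c "${CAMERA:-0}" -W "${WIDTH:-1280}" -H "${HEIGHT:-720}"
    --discard-after 0 --scan-every 0 --no-3d-adapt 1 --max-feature-updates 900
  )
  # Join the array with spaces so the result can be inspected or pasted.
  echo "${cmd[*]}"
}
```

For example, `CAMERA=1 tracker_cmd` prints the same command with `-c 1`. Running the printed command inside the venv starts the tracker.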

This will show many lines that look something like this:

```
Took 20.50ms (detect: 0.00ms, crop: 0.82ms, track: 17.70ms, 3D points: 1.93ms)
Confidence[0]: 0.9148 / 3D fitting error: 12.7974 / Eyes: O, O
```

This dumps most of the internal state of the face tracking algorithm. VSeeFace
picks up the tracking data that OpenSeeFace sends over UDP and turns it into
movement instructions for your waifu.
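
That log is also handy for spotting when tracking gets shaky. Here is a small sketch that reads tracker output on stdin and prints every confidence value below a threshold. It assumes the lines look exactly like the sample above; `low_confidence` and its default threshold of 0.5 are made up for this example.

```shell
#!/usr/bin/env bash
# Print the confidence values that fall below a threshold (default 0.5).
# Assumption: lines look like "Confidence[0]: 0.9148 / 3D fitting error: ...".
low_confidence() {
  local threshold="${1:-0.5}"
  # Split on runs of spaces, colons and slashes so field 2 is the number.
  awk -v t="$threshold" -F'[ :/]+' '/^Confidence/ { if ($2 + 0 < t) print $2 }'
}
```

You could pipe the tracker into it, for example `python facetracker.py -c 0 | low_confidence 0.6`, to get a line whenever the face is mostly lost.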

Finally, you can make an XComposite window capture in OBS and use that to send
everything through to Twitch.

## Nice Wrapper Script

[All these instructions are lame, I just wanna get it done
fast!](conversation://Numa/delet)

You can get this all running with a super hacky script like this!

```shell
#!/usr/bin/env nix-shell
#! nix-shell -p wget -p unzip -p git -p winetricks -p wine64 -i bash

mkdir -p ~/tmp/VTubing
cd ~/tmp/VTubing

wget https://github.com/emilianavt/VSeeFaceReleases/releases/download/v1.13.37b/VSeeFace-v1.13.37b.zip
unzip VSeeFace-v1.13.37b.zip

WINE=wine64 winetricks arial

git clone https://github.com/emilianavt/OpenSeeFace

# wget -O already writes the file to shell.nix, so nothing is piped afterwards
(cd OpenSeeFace && wget -O shell.nix https://gist.githubusercontent.com/Xe/d739fd94c81c1690645c8f4607058488/raw/100c8c5e43ed8dc4b19b890173234ff28b0f9c7e/shell.nix && nix-shell) &
(cd VSeeFace && wine64 VSeeFace.exe) &

wait
```

This will get you everything set up and ready to go in a flash! No warranty.

[You should really do this automagically with Nix.](conversation://Mara/hmm)

[Yes, I should, but that is for another day. This day is not today.](conversation://Cadey/coffee)

---

I'm really glad that I have this working on Linux though. I feel really bad
about being known as a Linux enthusiast while all of my streams are visibly
using Windows. It's totally valid to want to start out on Windows because it's
easier though. This stuff is baroque and complicated. Hopefully this will make
the path a bit clearer if you want to do VTubing on Linux like I am.

This article was written live on Twitch! Check out the stream VOD
[here](https://www.twitch.tv/videos/1264594247), and in a few days it will be live on YouTube
[here](https://youtu.be/cSR1ZA012aQ). Follow [my channel](https://twitch.tv/princessxen)
and get notified when I go live with more writing.
@@ -1,6 +1,7 @@
---
title: How I VTuber
date: 2022-01-13
series: vtuber
tags:
- ENVtuber
---
@@ -0,0 +1,75 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.40.1 (20161225.0304)
-->
<!-- Title: G Pages: 1 -->
<svg width="316pt" height="318pt"
viewBox="0.00 0.00 316.17 317.60" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 313.6)">
<title>G</title>
<polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-313.6 312.174,-313.6 312.174,4 -4,4"/>
<g id="clust1" class="cluster">
<title>cluster_1</title>
<polygon fill="#d3d3d3" stroke="#d3d3d3" points="8,-8 8,-230.8 230,-230.8 230,-8 8,-8"/>
<text text-anchor="middle" x="119" y="-214.2" font-family="Times,serif" font-size="14.00" fill="#000000">NixOS</text>
</g>
<!-- webcam -->
<g id="node1" class="node">
<title>webcam</title>
<ellipse fill="none" stroke="#000000" cx="80" cy="-291.6" rx="43.4183" ry="18"/>
<text text-anchor="middle" x="80" y="-287.4" font-family="Times,serif" font-size="14.00" fill="#000000">webcam</text>
</g>
<!-- losf -->
<g id="node3" class="node">
<title>losf</title>
<ellipse fill="#ffffff" stroke="#ffffff" cx="80" cy="-180" rx="64.2436" ry="18"/>
<text text-anchor="middle" x="80" y="-175.8" font-family="Times,serif" font-size="14.00" fill="#000000">OpenSeeFace</text>
</g>
<!-- webcam->losf -->
<g id="edge1" class="edge">
<title>webcam->losf</title>
<path fill="none" stroke="#000000" d="M80,-273.1715C80,-255.539 80,-228.6924 80,-208.3391"/>
<polygon fill="#000000" stroke="#000000" points="83.5001,-208.0855 80,-198.0856 76.5001,-208.0856 83.5001,-208.0855"/>
<text text-anchor="middle" x="93.6129" y="-243" font-family="Times,serif" font-size="14.00" fill="#000000">USB</text>
</g>
<!-- twitch -->
<g id="node2" class="node">
<title>twitch</title>
<ellipse fill="none" stroke="#000000" cx="273" cy="-107" rx="35.3489" ry="18"/>
<text text-anchor="middle" x="273" y="-102.8" font-family="Times,serif" font-size="14.00" fill="#000000">twitch</text>
</g>
<!-- lvsf -->
<g id="node4" class="node">
<title>lvsf</title>
<ellipse fill="#ffffff" stroke="#ffffff" cx="114" cy="-34" rx="50.3567" ry="18"/>
<text text-anchor="middle" x="114" y="-29.8" font-family="Times,serif" font-size="14.00" fill="#000000">VSeeFace</text>
</g>
<!-- losf->lvsf -->
<g id="edge2" class="edge">
<title>losf->lvsf</title>
<path fill="none" stroke="#000000" d="M82.7781,-161.739C85.736,-143.2703 90.8362,-113.9646 97.0042,-89 99.2096,-80.0738 102.0249,-70.4601 104.7268,-61.8057"/>
<polygon fill="#000000" stroke="#000000" points="108.0909,-62.7767 107.8034,-52.1857 101.4236,-60.6444 108.0909,-62.7767"/>
<text text-anchor="middle" x="110.9979" y="-102.8" font-family="Times,serif" font-size="14.00" fill="#000000">UDP</text>
</g>
<!-- lobs -->
<g id="node5" class="node">
<title>lobs</title>
<ellipse fill="#ffffff" stroke="#ffffff" cx="192" cy="-180" rx="29.6339" ry="18"/>
<text text-anchor="middle" x="192" y="-175.8" font-family="Times,serif" font-size="14.00" fill="#000000">OBS</text>
</g>
<!-- lobs->twitch -->
<g id="edge4" class="edge">
<title>lobs->twitch</title>
<path fill="none" stroke="#000000" d="M208.7833,-164.8744C220.0653,-154.7066 235.1385,-141.1221 247.8461,-129.6695"/>
<polygon fill="#000000" stroke="#000000" points="250.3124,-132.1585 255.3977,-122.8638 245.6261,-126.9586 250.3124,-132.1585"/>
</g>
<!-- lobs->lvsf -->
<g id="edge3" class="edge">
<title>lobs->lvsf</title>
<path fill="none" stroke="#000000" d="M180.9095,-162.9172C174.0943,-152.1868 165.337,-137.9693 158.2258,-125 146.7004,-103.9802 134.8947,-79.5101 126.383,-61.2421"/>
<polygon fill="#000000" stroke="#000000" points="129.5067,-59.658 122.1336,-52.0495 123.1527,-62.5952 129.5067,-59.658"/>
<text text-anchor="middle" x="193.3871" y="-102.8" font-family="Times,serif" font-size="14.00" fill="#000000">XComposite</text>
</g>
</g>
</svg>
@@ -0,0 +1,75 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.40.1 (20161225.0304)
-->
<!-- Title: G Pages: 1 -->
<svg width="288pt" height="340pt"
viewBox="0.00 0.00 288.17 340.43" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 336.4313)">
<title>G</title>
<polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-336.4313 284.174,-336.4313 284.174,4 -4,4"/>
<g id="clust1" class="cluster">
<title>cluster_0</title>
<polygon fill="#d3d3d3" stroke="#d3d3d3" points="8,-8 8,-253.6313 202,-253.6313 202,-8 8,-8"/>
<text text-anchor="middle" x="105" y="-237.0313" font-family="Times,serif" font-size="14.00" fill="#000000">windows</text>
</g>
<!-- webcam -->
<g id="node1" class="node">
<title>webcam</title>
<ellipse fill="none" stroke="#000000" cx="66" cy="-314.4313" rx="43.4183" ry="18"/>
<text text-anchor="middle" x="66" y="-310.2313" font-family="Times,serif" font-size="14.00" fill="#000000">webcam</text>
</g>
<!-- wvsf -->
<g id="node3" class="node">
<title>wvsf</title>
<ellipse fill="#ffffff" stroke="#ffffff" cx="66" cy="-202.8313" rx="50.3567" ry="18"/>
<text text-anchor="middle" x="66" y="-198.6313" font-family="Times,serif" font-size="14.00" fill="#000000">VSeeFace</text>
</g>
<!-- webcam->wvsf -->
<g id="edge1" class="edge">
<title>webcam->wvsf</title>
<path fill="none" stroke="#000000" d="M66,-296.0028C66,-278.3703 66,-251.5237 66,-231.1704"/>
<polygon fill="#000000" stroke="#000000" points="69.5001,-230.9168 66,-220.9168 62.5001,-230.9169 69.5001,-230.9168"/>
<text text-anchor="middle" x="79.6129" y="-265.8313" font-family="Times,serif" font-size="14.00" fill="#000000">USB</text>
</g>
<!-- twitch -->
<g id="node2" class="node">
<title>twitch</title>
<ellipse fill="none" stroke="#000000" cx="245" cy="-129.8313" rx="35.3489" ry="18"/>
<text text-anchor="middle" x="245" y="-125.6313" font-family="Times,serif" font-size="14.00" fill="#000000">twitch</text>
</g>
<!-- wcd -->
<g id="node4" class="node">
<title>wcd</title>
<ellipse fill="#ffffff" stroke="#ffffff" cx="104" cy="-45.4156" rx="46.4831" ry="29.3315"/>
<text text-anchor="middle" x="104" y="-49.6156" font-family="Times,serif" font-size="14.00" fill="#000000">Webcam</text>
<text text-anchor="middle" x="104" y="-32.8156" font-family="Times,serif" font-size="14.00" fill="#000000">Driver</text>
</g>
<!-- wvsf->wcd -->
<g id="edge2" class="edge">
<title>wvsf->wcd</title>
<path fill="none" stroke="#000000" d="M70.3591,-184.7737C76.2169,-160.5076 86.7858,-116.7257 94.5186,-84.6925"/>
<polygon fill="#000000" stroke="#000000" points="97.9756,-85.2867 96.92,-74.7446 91.1711,-83.644 97.9756,-85.2867"/>
</g>
<!-- wobs -->
<g id="node5" class="node">
<title>wobs</title>
<ellipse fill="#ffffff" stroke="#ffffff" cx="164" cy="-202.8313" rx="29.6339" ry="18"/>
<text text-anchor="middle" x="164" y="-198.6313" font-family="Times,serif" font-size="14.00" fill="#000000">OBS</text>
</g>
<!-- wobs->twitch -->
<g id="edge4" class="edge">
<title>wobs->twitch</title>
<path fill="none" stroke="#000000" d="M180.7833,-187.7056C192.0653,-177.5379 207.1385,-163.9534 219.8461,-152.5008"/>
<polygon fill="#000000" stroke="#000000" points="222.3124,-154.9898 227.3977,-145.6951 217.6261,-149.7899 222.3124,-154.9898"/>
</g>
<!-- wobs->wcd -->
<g id="edge3" class="edge">
<title>wobs->wcd</title>
<path fill="none" stroke="#000000" d="M157.2339,-185.0797C147.9295,-160.6688 130.9424,-116.1015 118.655,-83.8645"/>
<polygon fill="#000000" stroke="#000000" points="121.8385,-82.3893 115.0063,-74.2917 115.2975,-84.8825 121.8385,-82.3893"/>
<text text-anchor="middle" x="166.8745" y="-125.6313" font-family="Times,serif" font-size="14.00" fill="#000000">Webcam</text>
</g>
</g>
</svg>