I parsed the last 11 pages of Soapbox, removing the quoted comments, and generated vector embeddings off the comments by user (not all of them; it's super slow, but chugging along). This is a plot of the average embedding across each user's posts, for the users I do have. Using PCA you can reduce the dimensionality to 2D so you can plot it on a graph.
What can you do with the embeddings:
- Find clusters of users that have similar posting styles and beliefs and dox aliases
- See the trend of posting style/belief over time
- Generate a mathematical "Political Compass"
- Ask questions like "is this poster a shitlib?"
If this is against ToS let me know @qntmfred.
[Edited on November 13, 2024 at 10:35 AM. Reason : damnnnnn]
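A rough sketch of the averaging and PCA steps described in this post, not the actual script. It assumes the per-post embeddings are already in memory as (username, vector) pairs. The function names are made up for illustration, and PCA is done with plain power iteration so nothing beyond the standard library is needed (a real run would more likely use NumPy or scikit-learn).

```python
import math


def mean_embedding_by_user(posts):
    """posts: iterable of (username, vector) pairs -> {username: mean vector}."""
    sums, counts = {}, {}
    for user, vec in posts:
        if user not in sums:
            sums[user] = [0.0] * len(vec)
            counts[user] = 0
        for i, x in enumerate(vec):
            sums[user][i] += x
        counts[user] += 1
    return {u: [x / counts[u] for x in s] for u, s in sums.items()}


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _leading_direction(rows, iters=200):
    """Direction of greatest variance in already-centered rows (power iteration)."""
    dim = len(rows[0])
    v = [float(j + 1) for j in range(dim)]  # arbitrary non-zero starting vector
    norm = math.sqrt(_dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iters):
        # w = X^T (X v): applies the scatter matrix without building it
        scores = [_dot(r, v) for r in rows]
        w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(_dot(w, w))
        if norm == 0.0:  # no variance left in the data
            break
        v = [x / norm for x in w]
    return v


def pca_2d(user_vectors):
    """{username: vector} -> {username: [x, y]} using the top two components."""
    users = list(user_vectors)
    vecs = [user_vectors[u] for u in users]
    dim = len(vecs[0])
    mean = [sum(v[j] for v in vecs) / len(vecs) for j in range(dim)]
    rows = [[v[j] - mean[j] for j in range(dim)] for v in vecs]
    coords = {u: [] for u in users}
    for _ in range(2):
        comp = _leading_direction(rows)
        scores = [_dot(r, comp) for r in rows]
        for u, s in zip(users, scores):
            coords[u].append(s)
        # Deflate: subtract the component just found so the next pass finds the second one
        rows = [[a - s * b for a, b in zip(r, comp)] for r, s in zip(rows, scores)]
    return coords
```

Calling `pca_2d(mean_embedding_by_user(posts))` gives one (x, y) point per user, which is what would get plotted. Flattening to 2D throws away most of the information in the original vectors, so users who look close on the plot are not necessarily close in the full embedding space.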
11/13/2024 10:29:58 AM
spelling my name incorrectly is against ToS[Edited on November 13, 2024 at 10:39 AM. Reason : which model did you use]
11/13/2024 10:33:29 AM
OpenAI 'text-embedding-3-small'. I may try it with Llama 3.1 too if I get frustrated with how long this takes. My guess was it would be way faster than burning my own cycles, but that does not appear to be the case.
11/13/2024 11:08:34 AM
utowncha is tgl or dtral going by cosine similarity.
The big image got scaled down in the gallery, but it's pretty neat.
Added stars for the regulars and infamous posters.
Raige is the most unusual in terms of the distance from others.
[Edited on November 13, 2024 at 1:20 PM. Reason : a]
11/13/2024 12:55:36 PM
Shockingly insulting.
11/13/2024 5:46:39 PM
hey what do you expect when you get flattened down to two dimensions
i played around with some twwerbots a while back but the results were uninspiring. might be time to try again. i've been playing with the llama 3.2 vision model. just wait until we build profiles out of 20 years of livestreaming rather than 20 years of message boarding and tweeting.
[Edited on November 13, 2024 at 6:01 PM. Reason : my how was your 2024 thread response will be generated by AI based on my livestreams]
11/13/2024 5:58:40 PM
^ yeah I built an esgarg one twice. It started using the n word and saying fag a lot, so I deleted the fine tune so my account didn't get cancelled.
Think I mentioned the Llama one in the other thread. I've been super curious how they would respond to current events. Would be cool to have one trained on only subsets, like ubertsb or uberchitchat.
[Edited on November 13, 2024 at 6:23 PM. Reason : A]
11/13/2024 6:23:29 PM
^^^^possible to add me? Wonder if I am close to a couple bros on here[Edited on November 13, 2024 at 6:26 PM. Reason : Too few][Edited on November 13, 2024 at 6:26 PM. Reason : Or two few aint that something]
11/13/2024 6:25:58 PM
I added you, you just may not be in the view there. Will take another screenshot when I find you.
StTexan top similar:
[(0.9573154205531155, 'rwoody'),
 (0.945110925776677, 'HaLo'),
 (0.9390791610460517, 'bbehe'),
 (0.9348659692783031, 'synapse'),
 (0.9337087257886083, 'Money_Jones'),
 (0.93322072709863, 'skywalkr'),
 (0.9314596584573368, 'The Coz'),
 (0.9304043675756769, 'qntmfred'),
 (0.9302928550201987, 'utowncha'),
 (0.9299006964490467, 'A Tanzarian')]
[Edited on November 13, 2024 at 6:52 PM. Reason : a]
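For anyone wondering how a list like that gets produced, here is a minimal sketch, assuming each user is represented by one averaged vector stored in a dict. `cosine_similarity` and `top_similar` are illustrative names, not the code that generated the numbers above. Cosine similarity is the cosine of the angle between two vectors: 1.0 means they point the same way, 0.0 means they are unrelated (orthogonal).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_similar(target, user_vectors, k=10):
    """Return the k users most similar to `target` as (score, username) tuples."""
    target_vec = user_vectors[target]
    scored = [
        (cosine_similarity(target_vec, vec), user)
        for user, vec in user_vectors.items()
        if user != target  # skip the self-match, which is always 1.0
    ]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:k]
```

Passing a larger `k` (say 50) just returns a longer slice of the same sorted list.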
11/13/2024 6:38:02 PM
Cool, that's a pretty good list to be on I think. What exactly is the cosine similarity thing?
Dang can you put top 50? Top 10 all real close at like .92
I bet my top 10 could rule the country better than yours!!! Lol
[Edited on November 13, 2024 at 6:59 PM. Reason : Add another note]
11/13/2024 6:55:45 PM
Interesting.
I'm flattered that I got a star!
And I think this proves I wasn't Rem Lezar.
11/13/2024 8:14:31 PM
I don't see myself on there... am I that far gone?
11/13/2024 10:13:54 PM
Of course ddd isn't on here...the weirdo is off the charts!
11/13/2024 10:47:12 PM
Near nighthawk in the upper left. Aaronburro is around I think in this view.
11/13/2024 10:57:35 PM
^You got got!
11/14/2024 1:33:34 AM
pretty cool
^^^I'm in the upper middle, towards the left of The Coz
So this just looks for similarity in the words used, or is there something else?
Also, aren't BubbleBobble and ReceiveDeath the same person? Yet, they're so far apart.
Yet utowncha and thegoodlife3 are right next to one another.
Maybe BB posted a lot more on one account when he was younger and had a different diction. Or tgl3 is just boring
I should generate a bot based off of my current profile. I wonder what it would talk about
this should be an inbuilt feature on social media sites
talk to your clone
[Edited on November 16, 2024 at 11:45 AM. Reason : it be interesting to do chit chat and the soapbox for a controlled set of active users and compare]
11/16/2024 11:40:40 AM
11/16/2024 12:11:19 PM
^^ If TGL is posting about the plight of the trans and utowncha was like, THIS FUCKING GUY, and posts in response, they would be closer together since the meaning of the comments would be similar (about trans stuff). It also picks up structure and word (token) frequency, but there are better techniques to determine similarities there. I bet a standard naive Bayes classifier would work pretty well for that. My original interest and question was whether you could map people to some kind of belief based on their post comments.
Makes a pretty good AI test bed. Would be cool to have it broken into something you could load with Pandas.
[Edited on November 16, 2024 at 8:05 PM. Reason : A]
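A minimal sketch of the naive Bayes idea mentioned above, assuming posts are available as (username, text) pairs. It is a plain multinomial model with add-one smoothing over whitespace tokens; the class name and tokenizer are placeholders for illustration, and nobody in this thread has actually run this on the Soapbox data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesAuthor:
    """Guess which user wrote a post from word frequencies alone."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing so unseen words don't zero out a user
        self.word_counts = defaultdict(Counter)
        self.post_counts = Counter()
        self.vocab = set()

    @staticmethod
    def tokenize(text):
        # Hypothetical tokenizer: lowercase and split on whitespace
        return text.lower().split()

    def fit(self, posts):
        for user, text in posts:
            tokens = self.tokenize(text)
            self.word_counts[user].update(tokens)
            self.post_counts[user] += 1
            self.vocab.update(tokens)
        return self

    def log_scores(self, text):
        """Unnormalized log P(user | text) for every known user."""
        total_posts = sum(self.post_counts.values())
        vocab_size = len(self.vocab)
        tokens = self.tokenize(text)
        scores = {}
        for user, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.post_counts[user] / total_posts)  # prior
            for tok in tokens:
                score += math.log(
                    (counts[tok] + self.alpha)
                    / (total_words + self.alpha * vocab_size)
                )
            scores[user] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)
```

Two accounts whose posts keep getting scored highly for each other would be candidates for the same writer, which is a different signal from the embedding distance: this looks only at word choice, not at what the post means.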
11/16/2024 8:04:06 PM