incredible
#has to be the same arch tho - and means abliteration is likely transferable through very smol datasets given the same architecture 😈
#makes sense though. Backpropagating outputs derived from a specific trait's inference path in a model *should* result in similar traits
#in a student model of the same arch because the transformer blocks start from the same values
#which fine tuning then updates toward the ones that made the provided outputs. Hence why finetuning can change alignment.
#what excites *me* is that this is likely more efficient for generating data to distill out a permanent trait
#from in-context learning than traditional datasets engineered to demonstrate said trait in plaintext. Hm. I must fuck about with this.
#I bet you could distill a custom writing voice that way very efficiently. Which is otherwise a pain in the ass to re-write an entire dataset for.
#and now I guess have another project u.u
#oOo if you kept one architecture you could perhaps use this to get newly digested data in context into a much compressed form
#for finetuning the base model to retain newly acquired skills or facts. hmm. definitely gotta fuck with this.
#my actual strategy for fact-dumping on the v2 dataset will probably be token-level storage with recomputation
#rather than direct CAG with an entire bloated copy of the computed cache.
#but for smaller tweaks? I could see this being hella useful >.> (rough sketches below)
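To make the distillation-from-context idea concrete, here's a minimal sketch assuming a HuggingFace-style stack; the checkpoint name, the `STYLE_CONTEXT` prompt, the toy prompts, and the hyperparameters are all placeholders, not anyone's actual recipe. The shape of it: sample completions with the trait held in context, then fine-tune the same checkpoint on prompt → completion pairs with the context stripped out, so the behaviour has to migrate into the weights.

```python
# Sketch: distill an in-context trait into the weights of the same architecture.
# MODEL, STYLE_CONTEXT, prompts, and hyperparameters are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "my-base-model"  # teacher and student are the SAME checkpoint/arch
STYLE_CONTEXT = "You write in a terse, lowercase voice with dry asides.\n"  # trait lives only here

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

prompts = ["summarize why kv caches get huge", "explain lora in two sentences"]

# 1) teacher pass: sample completions WITH the trait sitting in context
pairs = []
model.eval()
with torch.no_grad():
    for p in prompts:
        ids = tok(STYLE_CONTEXT + p, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=128, do_sample=True, temperature=0.8)
        completion = tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)
        pairs.append((p, completion))  # the context itself is dropped from the pair

# 2) student pass: fine-tune the same weights on prompt -> completion with no context,
#    loss masked to the completion tokens so the trait has to end up in the weights
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for p, completion in pairs:
    prompt_len = tok(p, return_tensors="pt").input_ids.shape[1]
    full = tok(p + completion, return_tensors="pt").input_ids
    labels = full.clone()
    labels[:, :prompt_len] = -100  # ignore loss on the prompt tokens
    loss = model(input_ids=full, labels=labels).loss
    loss.backward()
    opt.step()
    opt.zero_grad()
```

The same loop covers the writing-voice case: put the voice in `STYLE_CONTEXT` (or as few-shot examples) and the dataset mostly writes itself instead of being re-written by hand.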
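And for the last couple of tags, the token-storage-versus-cache tradeoff in sketch form, again assuming a HuggingFace-style causal LM; the model name and file names are placeholders, and the exact `past_key_values` format varies by transformers version. Store just the tokens and pay a prefill pass to rebuild the cache, or persist the whole computed KV cache and pay for it in disk space and architecture lock-in.

```python
# Sketch: token-level storage with recomputation vs. CAG-style persistence of the KV cache.
# MODEL and file names are hypothetical placeholders.
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "my-base-model"
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

document = "some newly digested reference text the model should have on hand"
ids = tok(document, return_tensors="pt").input_ids

# option A: token-level storage. keep only the token ids (a few bytes per token)
# and pay one prefill pass to recompute the cache whenever the context is needed.
torch.save(ids, "doc_tokens.pt")

def rebuild_cache(token_ids):
    with torch.no_grad():
        return model(input_ids=token_ids, use_cache=True).past_key_values

# option B: CAG-style storage. persist the computed KV cache itself: no prefill
# cost later, but it's a key and a value tensor of [batch, heads, seq_len, head_dim]
# per layer, and it only loads back into this exact architecture and shape.
with torch.no_grad():
    cache = model(input_ids=ids, use_cache=True).past_key_values
torch.save(cache, "doc_kv_cache.pt")  # cache object format depends on transformers version

print(os.path.getsize("doc_tokens.pt"), "bytes of tokens vs",
      os.path.getsize("doc_kv_cache.pt"), "bytes of cache")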