incredible
#has to be the same arch tho - and means abliteration is likely transferable through very smol datasets given the same architecture 😈
#makes sense though. Backpropagating outputs derived from a specific trait's inference path in a model *should* result in similar traits
#in a student model of the same arch because the transformer blocks start from the same values
#which fine tuning then updates toward the ones that made the provided outputs. Hence why finetuning can change alignment.
#what excites *me* is that this is likely more efficient for generating data to distill out a permanent trait
#from in-context learning than traditional datasets engineered to demonstrate said trait in plaintext. Hm. I must fuck about with this.
#I bet you could distill a custom writing voice that way very efficiently. Which is otherwise a pain in the ass to re-write an entire dataset for.
#and now I guess have another project u.u
#oOo if you kept one architecture you could perhaps use this to get newly digested data in context into a much compressed form
#for finetuning the base model to retain newly acquired skills or facts. hmm. definitely gotta fuck with this.
#my actual strategy for fact-dumping on the v2 dataset will probably be token-level storage with recomputation
#rather than direct CAG with an entire bloated copy of the computed cache.
#but for smaller tweaks? I could see this being hella useful >.> (rough sketches below)
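To make the distillation-from-context idea concrete, here's a minimal sketch assuming a HuggingFace-style stack; the checkpoint name, the `STYLE_CONTEXT` prompt, the toy prompts, and the hyperparameters are all placeholders, not anyone's actual recipe. The shape of it: sample completions with the trait held in context, then fine-tune the same checkpoint on prompt → completion pairs with the context stripped out, so the behaviour has to migrate into the weights.

```python
# Sketch: distill an in-context trait into the weights of the same architecture.
# MODEL, STYLE_CONTEXT, prompts, and hyperparameters are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "my-base-model"  # teacher and student are the SAME checkpoint/arch
STYLE_CONTEXT = "You write in a terse, lowercase voice with dry asides.\n"  # trait lives only here

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

prompts = ["summarize why kv caches get huge", "explain lora in two sentences"]

# 1) teacher pass: sample completions WITH the trait sitting in context
pairs = []
model.eval()
with torch.no_grad():
    for p in prompts:
        ids = tok(STYLE_CONTEXT + p, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=128, do_sample=True, temperature=0.8)
        completion = tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)
        pairs.append((p, completion))  # the context itself is dropped from the pair

# 2) student pass: fine-tune the same weights on prompt -> completion with no context,
#    loss masked to the completion tokens so the trait has to end up in the weights
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for p, completion in pairs:
    prompt_len = tok(p, return_tensors="pt").input_ids.shape[1]
    full = tok(p + completion, return_tensors="pt").input_ids
    labels = full.clone()
    labels[:, :prompt_len] = -100  # ignore loss on the prompt tokens
    loss = model(input_ids=full, labels=labels).loss
    loss.backward()
    opt.step()
    opt.zero_grad()
```

The same loop covers the writing-voice case: put the voice in `STYLE_CONTEXT` (or as few-shot examples) and the dataset mostly writes itself instead of being re-written by hand.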
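And for the last couple of tags, the token-storage-versus-cache tradeoff in sketch form, again assuming a HuggingFace-style causal LM; the model name and file names are placeholders, and the exact `past_key_values` format varies by transformers version. Store just the tokens and pay a prefill pass to rebuild the cache, or persist the whole computed KV cache and pay for it in disk space and architecture lock-in.

```python
# Sketch: token-level storage with recomputation vs. CAG-style persistence of the KV cache.
# MODEL and file names are hypothetical placeholders.
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "my-base-model"
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

document = "some newly digested reference text the model should have on hand"
ids = tok(document, return_tensors="pt").input_ids

# option A: token-level storage. keep only the token ids (a few bytes per token)
# and pay one prefill pass to recompute the cache whenever the context is needed.
torch.save(ids, "doc_tokens.pt")

def rebuild_cache(token_ids):
    with torch.no_grad():
        return model(input_ids=token_ids, use_cache=True).past_key_values

# option B: CAG-style storage. persist the computed KV cache itself: no prefill
# cost later, but it's a key and a value tensor of [batch, heads, seq_len, head_dim]
# per layer, and it only loads back into this exact architecture and shape.
with torch.no_grad():
    cache = model(input_ids=ids, use_cache=True).past_key_values
torch.save(cache, "doc_kv_cache.pt")  # cache object format depends on transformers version

print(os.path.getsize("doc_tokens.pt"), "bytes of tokens vs",
      os.path.getsize("doc_kv_cache.pt"), "bytes of cache")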