Bag Of Words Model
In the BoW model, we supply a list of words associated with a particular attribute, i.e. words that typically appear in sentences carrying that attribute. For example, for a topic-based attribute such as Politics, where we want to steer generation towards politics, a typical bag of words will include terms like government, politics, democracy, federation, and so on. These words are then used for perturbation through PPLM.jl.
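For intuition, such a bag of words is nothing more than a list of topic words. The snippet below is purely illustrative; PPLM.jl maintains its own word lists, and the examples that follow select one by name through the bow_list argument (bow_list = ["politics"]).

# Illustrative only: a small hand-written bag of words for the "politics" topic.
# The list actually used by PPLM.jl when bow_list = ["politics"] is passed is much larger.
politics_bow = ["government", "politics", "democracy", "federation",
                "election", "parliament", "policy", "senate"]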
Let's work through an example to see how this is used in practice. First, initialize the package and load the model:
using PPLM

# Load the GPT-2 model and its tokenizer, and move the model to the GPU
tokenizer, model = PPLM.get_gpt2();
model = model |> PPLM.gpu
Prompt: To conclude
Perturb Probability
This feature is only supported for the Bag of Words model. The output probabilities can be perturbed as shown in the following example:
args = PPLM.pplm(perturb="probs", bow_list=["politics"], stepsize=1.0, fusion_gm_scale=0.8f0, top_k=50)
PPLM.sample_pplm(args; tokenizer=tokenizer, model=model, prompt="To conclude")
A more explicit, lower-level way to generate is:
# Start from the EOS token followed by the tokenized prompt
input_ = [tokenizer.eos_token_id; tokenizer("To conclude")]
args = PPLM.pplm(perturb="probs", bow_list=["politics"], stepsize=1.0, fusion_gm_scale=0.8f0, top_k=50)
for i in 1:100
    input_ids = reshape(input_[:], :, 1)                    # batch of one sequence
    outputs = model(input_ids; output_attentions=false,
                    output_hidden_states=true,
                    use_cache=false);
    original_logits = outputs.logits[:, end, 1]             # logits for the last position
    original_probs = PPLM.temp_softmax(original_logits; t=args.temperature)
    # Perturb the output probabilities towards the bag of words
    pert_probs = PPLM.perturb_probs(original_probs, tokenizer, args)
    # Fuse original and perturbed distributions with a weighted geometric mean
    gm_scale = args.fusion_gm_scale
    pert_probs = Float32.((original_probs.^(1-gm_scale)).*(pert_probs.^(gm_scale)))
    new_token = PPLM.top_k_sample(pert_probs; k=args.top_k)[1]
    push!(input_, new_token)
end
text = detokenize(tokenizer, input_)
Sample generation:
"To conclude the last week about our current policy, they say we \"don't follow their dictates.\"[25][26]
We've never followed that precept, so in order, what we have is a very limited, almost not always followed
agenda. It took decades to implement and it has already occurred once.[27] These are the arguments that
most conservative leaders used to put forth to show the world that our public spending has failed – as
conservatives pointed out. What they don't seem to understand is that this ideology"
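The fusion step inside the loop combines the original and perturbed distributions with a weighted geometric mean controlled by fusion_gm_scale. Here is a small standalone illustration in plain Julia with made-up numbers (independent of PPLM.jl):

# Weighted geometric-mean fusion of two toy distributions.
# gm_scale = 1 would keep only the perturbed distribution, gm_scale = 0 only the original one.
original_probs = Float32[0.5, 0.3, 0.2]
pert_probs     = Float32[0.1, 0.2, 0.7]
gm_scale       = 0.8f0

fused = (original_probs .^ (1 - gm_scale)) .* (pert_probs .^ gm_scale)
fused = fused ./ sum(fused)   # rescale so the fused values sum to one again

The fused vector leans towards the perturbed distribution while staying anchored to what the unmodified model considers likely.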
Perturb Hidden States
The hidden states can be perturbed as shown in the following example:
args = PPLM.pplm(perturb="hidden", bow_list=["politics"], stepsize=0.02, fusion_gm_scale=0.8f0, top_k=50)
PPLM.sample_pplm(args; tokenizer=tokenizer, model=model, prompt="To conclude")
A more explicit, lower-level way to generate is:
# Start from the EOS token followed by the tokenized prompt
input_ = [tokenizer.eos_token_id; tokenizer("To conclude")]
args = PPLM.pplm(perturb="hidden", bow_list=["politics"], stepsize=0.02, fusion_gm_scale=0.8f0, top_k=50)
for i in 1:100
    input_ids = reshape(input_[:], :, 1) |> PPLM.gpu        # batch of one sequence, on the same device as the model
    outputs = model(input_ids; output_attentions=false,
                    output_hidden_states=true,
                    use_cache=false);
    original_logits = outputs.logits[:, end, 1]
    original_probs = PPLM.temp_softmax(original_logits; t=args.temperature)
    # Perturb the final hidden states towards the bag of words, then re-project them to logits
    hidden = outputs.hidden_states[end]
    modified_hidden = PPLM.perturb_hidden_bow(model, hidden, args)
    pert_logits = model.lm_head(modified_hidden)[:, end, 1]
    pert_probs = PPLM.temp_softmax(pert_logits; t=args.temperature)
    # Fuse original and perturbed distributions with a weighted geometric mean
    gm_scale = args.fusion_gm_scale
    pert_probs = Float32.((original_probs.^(1-gm_scale)).*(pert_probs.^(gm_scale))) |> cpu
    new_token = PPLM.top_k_sample(pert_probs; k=args.top_k)[1]
    push!(input_, new_token)
end
text = detokenize(tokenizer, input_)
Sample generation:
"To conclude this brief essay, I have briefly discussed one of my own writing's main points: How a great
many poor working people who were forced by the government to sell goods to high-end supermarkets to make
ends meet were put off purchasing goods at a time they wouldn't be able afford. That point of distinction
arises in every social democracy I identify as libertarian.\n\nA large number of people in this group
simply did not follow basic political norms, and in order not to lose faith that politics was in"
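Every loop above converts logits into probabilities with PPLM.temp_softmax, a softmax with a temperature parameter. For intuition only, a minimal standalone sketch of the idea (this is not the package's implementation) could look like:

# Temperature softmax: a larger t flattens the distribution, a smaller t sharpens it.
function toy_temp_softmax(logits::Vector{Float32}; t=1.0f0)
    z = exp.((logits .- maximum(logits)) ./ t)   # subtract the maximum for numerical stability
    return z ./ sum(z)
end

toy_temp_softmax(Float32[2.0, 1.0, 0.1]; t=1.3f0)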
Perturb Past
The cached past key values can be perturbed as shown in the following example:
args = PPLM.pplm(perturb="past", bow_list=["politics"], stepsize=0.005, fusion_gm_scale=0.8f0, top_k=50)
PPLM.sample_pplm(args; tokenizer=tokenizer, model=model, prompt="To conclude")
A more explicit, lower-level way to generate is:
# Start from the EOS token followed by the tokenized prompt
input_ = [tokenizer.eos_token_id; tokenizer("To conclude")]
args = PPLM.pplm(perturb="past", bow_list=["politics"], stepsize=0.005, fusion_gm_scale=0.8f0, top_k=50)
for i in 1:100
    input_ids = reshape(input_[:], :, 1) |> PPLM.gpu
    inp = input_ids[1:end-1, :]          # everything except the most recent token
    prev = input_ids[end:end, :]         # the most recent token
    outputs = model(inp; output_attentions=false,
                    output_hidden_states=true,
                    use_cache=true);
    past = outputs.past_key_values;
    original_logits = outputs.logits[:, end, 1]
    original_probs = PPLM.temp_softmax(original_logits; t=args.temperature)
    # Perturb the cached key values towards the bag of words, then rerun the last token with the new past
    new_past = PPLM.perturb_past_bow(model, prev, past, original_probs, args)
    output_new = model(prev; past_key_values=new_past,
                       output_attentions=false,
                       output_hidden_states=true,
                       use_cache=true);
    pert_logits = output_new.logits[:, end, 1]
    pert_probs = PPLM.temp_softmax(pert_logits; t=args.temperature)
    # Fuse original and perturbed distributions with a weighted geometric mean
    gm_scale = args.fusion_gm_scale
    pert_probs = Float32.((original_probs.^(1-gm_scale)).*(pert_probs.^(gm_scale))) |> cpu
    new_token = PPLM.top_k_sample(pert_probs; k=args.top_k)[1]
    push!(input_, new_token)
end
text = detokenize(tokenizer, input_)
Sample generation:
"To conclude, it's important for governments, from the government of Canada, who decide matters of
international importance and culture and language issues to the authorities the responsible party for
immigration enforcement, when that person's an international terrorist, as these are important and cultural
communities, rather and international business people, like the Canadian government, should take seriously
when they say these, to the authorities, and then have the Canadian people deal with, and to them be more
involved in the process itself and their work ethics should really be to"
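Likewise, PPLM.top_k_sample picks the next token from the k most probable candidates. A toy standalone version of the idea (again, not the package's implementation) is sketched below:

# Toy top-k sampling: keep the k highest-probability entries, renormalize, and draw one index.
function toy_top_k_sample(probs::Vector{Float32}; k::Int=50)
    idx = partialsortperm(probs, 1:min(k, length(probs)); rev=true)   # indices of the k largest entries
    p = probs[idx] ./ sum(probs[idx])                                  # renormalize over the kept tokens
    r, acc = rand(), 0.0
    for (i, pi) in zip(idx, p)
        acc += pi
        acc >= r && return i
    end
    return idx[end]
end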
Note: For different topics, you may need to tune hyperparameters such as stepsize and fusion_gm_scale to get really interesting results; more details on this will be added later. Also note that the first iteration usually takes more time to evaluate the gradients, but consecutive passes are much faster.
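One simple (though slow) way to approach such tuning is to sweep a few candidate values and compare the generations by eye. The sketch below is purely hypothetical; the stepsize values are examples, not recommended settings.

# Hypothetical stepsize sweep for hidden-state perturbation; inspect each generation manually.
for s in (0.01, 0.02, 0.05)
    args = PPLM.pplm(perturb="hidden", bow_list=["politics"], stepsize=s, fusion_gm_scale=0.8f0, top_k=50)
    println("stepsize = ", s)
    PPLM.sample_pplm(args; tokenizer=tokenizer, model=model, prompt="To conclude")
end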