Circuits tokenizer

Tokenizer class for quantum circuits. It encodes circuits into tensor representations and decodes such tensors back into circuit instructions.


CircuitTokenizer


def CircuitTokenizer(
    vocabulary:dict[str, int] | dict[typing.Any, int], sign_labels:Optional=None
)->None:

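Converts between circuit instructions and token tensors. `vocabulary` maps each gate name to a positive integer token; the value 0 is reserved for background tokens, i.e. qubit-time positions that are not affected by any gate.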

Test

import torch
# Import path assumed from the package layout; adjust it to your installation.
from genQC.platform.tokenizer.circuits_tokenizer import CircuitTokenizer

# Token tensor of shape [qubits, time]; 0 marks background positions.
tensor = torch.tensor([
    [1, 0, -2],
    [0, 1,  2],
    [0, 0, -2],
], dtype=torch.int32)

# Parameter tensor of shape [max_params, time].
params_tensor = torch.tensor([
    [-0.9,  0.9, 0],
    [ 0.1, -0.7, 0],
])

tokenizer    = CircuitTokenizer({"u2": 1, "ccx": 2})
instructions = tokenizer.decode(tensor, params_tensor)

instructions.print()
print(instructions.instruction_names_set)

Output:

CircuitInstruction(name='u2', control_nodes=[], target_nodes=[0], params=[0.628318727016449, 6.91150426864624])
CircuitInstruction(name='u2', control_nodes=[], target_nodes=[1], params=[11.9380521774292, 1.8849557638168335])
CircuitInstruction(name='ccx', control_nodes=[0, 2], target_nodes=[1], params=[6.2831854820251465, 6.2831854820251465])
{'u2', 'ccx'}
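
The decoded output above suggests how the token tensor is laid out: each row is a qubit, each column a time step, the absolute value of a non-zero entry is the gate token, and the sign separates control nodes (negative) from target nodes (positive). The following is a minimal pure-Python sketch of that reading. It is inferred only from the example output, not taken from the library source; `decode_tokens` is a hypothetical helper, and it assumes at most one gate per time step.

```python
def decode_tokens(token_rows, vocabulary):
    """Hypothetical helper: read one gate per time step from a [qubits, time] token grid."""
    # Invert the vocabulary so that integer tokens map back to gate names.
    names = {token: name for name, token in vocabulary.items()}
    num_steps = len(token_rows[0]) if token_rows else 0

    decoded = []
    for t in range(num_steps):
        column = [row[t] for row in token_rows]
        if not any(column):
            continue  # only background tokens at this time step
        # Assumption: a single gate per time step, so all non-zero entries share |token|.
        token = max(abs(v) for v in column)
        controls = [q for q, v in enumerate(column) if v < 0]
        targets = [q for q, v in enumerate(column) if v > 0]
        decoded.append((names[token], controls, targets))
    return decoded


tokens = [
    [1, 0, -2],
    [0, 1,  2],
    [0, 0, -2],
]
for name, controls, targets in decode_tokens(tokens, {"u2": 1, "ccx": 2}):
    print(name, controls, targets)
```

Under these assumptions the sketch yields the same gate names, control nodes, and target nodes as the `CircuitInstruction` lines printed by `tokenizer.decode`.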
enc_tensor, enc_params_tensor = tokenizer.encode(instructions)
enc_tensor, enc_params_tensor

Output:

(tensor([[ 1,  0, -2],
         [ 0,  1,  2],
         [ 0,  0, -2]], dtype=torch.int32),
 tensor([[-0.9000,  0.9000,  0.0000],
         [ 0.1000, -0.7000,  0.0000]]))

# Encoding the decoded instructions reproduces the original tensors.
assert torch.allclose(tensor, enc_tensor)
assert torch.allclose(params_tensor, enc_params_tensor)
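
Comparing `params_tensor` with the decoded `params` values shows a consistent pattern: every stored value p appears in the instruction as (p + 1) · 2π. For example, -0.9 becomes roughly 0.628 and 0.9 becomes roughly 11.938. The sketch below reproduces that arithmetic with two hypothetical helpers, `to_angle` and `to_normalized`. The mapping is inferred from the printed numbers, so treat it as an assumption about the library rather than its documented behaviour.

```python
import math


def to_angle(p):
    """Hypothetical helper: map a stored tensor value to an angle, assuming (p + 1) * 2*pi."""
    return (p + 1.0) * 2.0 * math.pi


def to_normalized(angle):
    """Inverse of to_angle under the same assumption."""
    return angle / (2.0 * math.pi) - 1.0


stored = [-0.9, 0.1, 0.9, -0.7, 0.0]
angles = [to_angle(p) for p in stored]
print([round(a, 4) for a in angles])

# Round trip: converting back recovers the stored values up to floating-point error.
recovered = [to_normalized(a) for a in angles]
assert all(math.isclose(p, r, abs_tol=1e-12) for p, r in zip(stored, recovered))
```

This mirrors the round trip checked by the two `torch.allclose` assertions: decoding followed by encoding returns the stored values.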
tokenizer = CircuitTokenizer({"u2": 1, "ccx": 2})
assert tokenizer.vocabulary == {'u2': 1, 'ccx': 2}

# Test background token checking: 0 is reserved, so the vocabulary is shifted by one.
tokenizer = CircuitTokenizer({"u2": 0, "ccx": 1, "h": 2, "ry": 3})
assert tokenizer.vocabulary == {"u2": 1, "ccx": 2, "h": 3, "ry": 4}

Output:

[WARNING]: The value 0 is reserved for background tokens, i.e. qubit time position which are not effected by gates.
[WARNING]: Automatically incrementing all vocabulary values by one ...
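
The warnings above describe a simple rule: if any gate is assigned the reserved value 0, all vocabulary values are incremented by one. A minimal sketch of that rule follows. `reserve_background_token` is a hypothetical helper written for illustration and is not part of the genQC API.

```python
def reserve_background_token(vocabulary):
    """Hypothetical helper: keep 0 free for background positions."""
    if 0 in vocabulary.values():
        # Shift every token up by one so that no gate uses the reserved value 0.
        return {name: token + 1 for name, token in vocabulary.items()}
    # Nothing to fix; return a copy so the caller's dict is left untouched.
    return dict(vocabulary)


print(reserve_background_token({"u2": 0, "ccx": 1, "h": 2, "ry": 3}))
print(reserve_background_token({"u2": 1, "ccx": 2}))
```

A vocabulary that already starts at 1 passes through unchanged, which matches the first assertion in the test above.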
print(CircuitTokenizer.get_parametrized_tokens(tokenizer.vocabulary))
assert CircuitTokenizer.get_parametrized_tokens(tokenizer.vocabulary) == [1, 4]

Output:

[1, 4]
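
In this example, `get_parametrized_tokens` returns the tokens of `u2` and `ry`, the two gates in the vocabulary that take continuous parameters, and skips `ccx` and `h`. How the library decides which gates are parametrized is not shown on this page. The sketch below assumes a hand-written set of gate names, `PARAMETRIZED_GATES`, and a hypothetical helper `parametrized_tokens`, purely to illustrate the filtering step.

```python
# Assumption: a hand-written set of gate names that take continuous parameters.
PARAMETRIZED_GATES = {"u2", "rx", "ry", "rz"}


def parametrized_tokens(vocabulary):
    """Hypothetical helper: tokens of all vocabulary gates that take parameters."""
    # Dicts preserve insertion order, so tokens come out in vocabulary order.
    return [token for name, token in vocabulary.items() if name in PARAMETRIZED_GATES]


print(parametrized_tokens({"u2": 1, "ccx": 2, "h": 3, "ry": 4}))
```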
Copyright 2025, Florian Fürrutter
