Skip to content

Latest commit

 

History

History
141 lines (111 loc) · 4.59 KB

STYLE_GUIDE.md

File metadata and controls

141 lines (111 loc) · 4.59 KB

Style Guide

Use black

For the most part, following our code style is very simple, we just use black to format code. See our Contributing Guide for how to run our formatting scripts.

Naming of Layers and Models

Capitalize all acronyms, e.g. LSTM not Lstm, KLDivergence not KlDivergence, GPT2, XLMRoberta, etc.

Files should be named with snake case, and an acronym should be consider a single "segment". For example XLMRoberta would map to xlm_roberta.py filename.

When a specific abbreviation is very common and is pronounceable (acronym), consider it as a standalone word, e.g. Bert, Deberta, etc. In this case, "Bert" is considered as a common noun and not an abbreviation anymore.

Import keras and keras_nlp as top-level objects

Prefer importing tf, keras and keras_nlp as top-level objects. We want it to be clear to a reader which symbols are from keras_nlp and which are from core keras.

For guides and examples using KerasNLP, the import block should look as follows:

import keras_nlp
import tensorflow as tf
from tensorflow import keras

tf.keras.activations.X
keras.activations.X

layers.X
keras.layers.X or keras_nlp.layers.X

Dense(1, activation='softmax')
keras.layers.Dense(1, activation='softmax')

For KerasNLP library code, keras_nlp will not be directly imported, but keras should still be used as a top-level object used to access library symbols.

Ideal layer style

When writing a new KerasNLP layer (or tokenizer or metric), please make sure to do the following:

  • Accept **kwargs in __init__ and forward this to the super class.
  • Keep a python attribute on the layer for each __init__ argument to the layer. The name and value should match the passed value.
  • Write a get_config() which chains to super.
  • Document the layer behavior thoroughly including call behavior though a class level docstring. Generally methods like build() and call() should not have their own docstring.
  • Docstring text should start on the same line as the opening quotes and otherwise follow PEP 257.
  • Document the masking behavior of the layer in the class level docstring as well.
  • Always include usage examples using the full symbol location in keras_nlp.
  • Include a reference citation if applicable.
class PositionEmbedding(keras.layers.Layer):
    """A layer which learns a position embedding for input sequences.

    This class accepts a single dense tensor as input, and will output a
    learned position embedding of the same shape.

    This class assumes that in the input tensor, the last dimension corresponds
    to the features, and the dimension before the last corresponds to the
    sequence.

    This layer does not supporting masking, but can be combined with a
    `keras.layers.Embedding` for padding mask support.

    Args:
        sequence_length: The maximum length of the dynamic sequence.

    Examples:

    Direct call.
    >>> layer = keras_nlp.layers.PositionEmbedding(sequence_length=10)
    >>> layer(tf.zeros((8, 10, 16))).shape
    TensorShape([8, 10, 16])

    Combining with a token embedding.
    ```python
    seq_length = 50
    vocab_size = 5000
    embed_dim = 128
    inputs = keras.Input(shape=(seq_length,))
    token_embeddings = keras.layers.Embedding(
        input_dim=vocab_size, output_dim=embed_dim
    )(inputs)
    position_embeddings = keras_nlp.layers.PositionEmbedding(
        sequence_length=seq_length
    )(token_embeddings)
    outputs = token_embeddings + position_embeddings
    ```

    Reference:
     - [Devlin et al., 2019](https://arxiv.org/abs/1810.04805)
    """

    def __init__(
        self,
        sequence_length,
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.sequence_length = int(sequence_length)

    def build(self, input_shape):
        super().build(input_shape)
        feature_size = input_shape[-1]
        self.position_embeddings = self.add_weight(
            "embeddings",
            shape=[self.sequence_length, feature_size],
        )

    def call(self, inputs):
        shape = tf.shape(inputs)
        input_length = shape[-2]
        position_embeddings = self.position_embeddings[:input_length, :]
        return tf.broadcast_to(position_embeddings, shape)

    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "sequence_length": self.sequence_length,
            }
        )
        return config