Hemm utilities
base64_decode_image(image)
Decodes a base64-encoded image string produced by hemm.utils.base64_encode_image.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image | str | Base64-encoded image string produced by hemm.utils.base64_encode_image. | required |
Returns:
Type | Description |
---|---|
Image.Image | PIL Image object. |
Source code in hemm/utils.py
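The decoding step can be sketched with the standard library alone. This is a minimal illustration, not hemm's implementation: it assumes the encoded string may carry a data-URI prefix of the form `data:<mimetype>;base64,`, which the reference above does not specify, and `decode_data_uri` is a hypothetical helper name.

```python
import base64


def decode_data_uri(encoded: str) -> bytes:
    """Return the raw bytes carried by a base64 string or data URI (hypothetical helper)."""
    # Assumption: an optional "data:<mimetype>;base64," prefix precedes the payload.
    payload = encoded.split(";base64,")[-1]
    return base64.b64decode(payload)


# The eight-byte PNG file signature, used here as stand-in image data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
encoded = "data:image/png;base64," + base64.b64encode(PNG_SIGNATURE).decode("ascii")
raw = decode_data_uri(encoded)
```

With Pillow installed, the bytes of a complete image file could then be opened with `PIL.Image.open(io.BytesIO(raw))` to obtain an `Image.Image`.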
base64_encode_image(image_path, mimetype=None)
Converts an image to a base64-encoded string so that it can be logged and rendered on the Weave dashboard.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image_path | Union[str, Image] | Path to the image or PIL Image object. | required |
mimetype | Optional[str] | MIME type of the image. | None |
Returns:
Type | Description |
---|---|
str | Base64-encoded image string. |
Source code in hemm/utils.py
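To make the role of the mimetype argument concrete, here is a hedged sketch of the encoding side. It works on raw bytes rather than PIL images so that it runs without extra dependencies. The helper `encode_image_bytes`, the data-URI output format, and the fallback MIME type are all assumptions for illustration and may differ from what hemm does.

```python
import base64
import mimetypes
from typing import Optional


def encode_image_bytes(data: bytes, path: str, mimetype: Optional[str] = None) -> str:
    """Encode raw image bytes as a base64 data URI (hypothetical helper)."""
    if mimetype is None:
        # Guess the MIME type from the file extension; fall back to a generic binary type.
        mimetype = mimetypes.guess_type(path)[0] or "application/octet-stream"
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mimetype};base64,{payload}"


# Stand-in image data: the eight-byte PNG file signature.
uri = encode_image_bytes(b"\x89PNG\r\n\x1a\n", "cat.png")
```

Passing an explicit `mimetype` skips the guess, which is useful when the data does not come from a file with a meaningful extension.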
publish_dataset_to_weave(dataset_path, dataset_name=None, prompt_column=None, ground_truth_image_column=None, split=None, data_limit=None, get_weave_dataset_reference=True, dataset_transforms=None, dump_dir='./dump', *args, **kwargs)
Publishes a HuggingFace dataset as a Weave dataset.
Example: publish a subset of MSCOCO from HuggingFace as a Weave dataset

    import weave
    from hemm.utils import publish_dataset_to_weave

    if __name__ == "__main__":
        weave.init(project_name="t2i_eval")

        # Replace the nested "sentences" record with its raw caption string.
        def preprocess_sentences_column(example):
            example["sentences"] = example["sentences"]["raw"]
            return example

        dataset_reference = publish_dataset_to_weave(
            dataset_path="HuggingFaceM4/COCO",
            prompt_column="sentences",
            ground_truth_image_column="image",
            split="validation",
            # dataset_transforms is documented as a list of callables.
            dataset_transforms=[preprocess_sentences_column],
            data_limit=10,
        )

Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_path | str | Path to the HuggingFace dataset. | required |
dataset_name | Optional[str] | Name of the Weave dataset. | None |
prompt_column | Optional[str] | Column name for prompt. | None |
ground_truth_image_column | Optional[str] | Column name for ground truth image. | None |
split | Optional[str] | Split to be used. | None |
data_limit | Optional[int] | Limit the number of data items. | None |
get_weave_dataset_reference | bool | Whether to return the Weave dataset reference. | True |
dataset_transforms | Optional[List[Callable]] | List of dataset transforms. | None |
dump_dir | Optional[str] | Directory to dump the results. | './dump' |
Returns:
Type | Description |
---|---|
Union[ObjectRef, None] | Weave dataset reference if get_weave_dataset_reference is True, otherwise None. |
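The interplay of data_limit and dataset_transforms can be illustrated on plain Python dictionaries. The sketch below is not hemm's implementation: `prepare_rows` is a hypothetical helper, it operates on a list of dicts instead of a HuggingFace dataset, and the order of operations (limit first, then each transform in list order) is an assumption.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def prepare_rows(
    rows: List[Row],
    data_limit: Optional[int] = None,
    dataset_transforms: Optional[List[Callable[[Row], Row]]] = None,
) -> List[Row]:
    """Limit the rows, then apply each transform in order (hypothetical helper)."""
    # Keep only the first `data_limit` rows when a limit is given.
    selected = rows if data_limit is None else rows[:data_limit]
    # Shallow-copy each row so transforms that reassign keys leave the input untouched.
    prepared = [dict(row) for row in selected]
    for transform in dataset_transforms or []:
        prepared = [transform(row) for row in prepared]
    return prepared


def preprocess_sentences_column(example: Row) -> Row:
    # Replace the nested "sentences" record with its raw caption string.
    example["sentences"] = example["sentences"]["raw"]
    return example


# Made-up rows shaped like the MSCOCO example above.
rows = [{"sentences": {"raw": f"caption {i}"}, "image": f"img_{i}.jpg"} for i in range(5)]
prepared = prepare_rows(rows, data_limit=2, dataset_transforms=[preprocess_sentences_column])
```

Each transform receives one row and returns the updated row, which is the same shape as the `preprocess_sentences_column` function in the example above.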