| Title: | An R Interface for Raven DataFrames (Beta0) |
|---|---|
| Description: | Provides an I/O interface between R data.frames and Raven DataFrames. Defines functions to both read and write DataFrame files, as well as serialize/deserialize data.frames/DataFrames. |
| Authors: | Phil Gaiser [aut, cre], Raven Computing [cph] |
| Maintainer: | Phil Gaiser <[email protected]> |
| License: | Apache License (== 2) |
| Version: | 0.2.0 |
| Built: | 2026-05-17 05:38:44 UTC |
| Source: | https://github.com/raven-computing/rdf |
The raw vector to be deserialized must represent a Raven DataFrame. That DataFrame is returned as an R data.frame object.
deserializeDataFrame(bytes)
| Argument | Description |
|---|---|
| bytes | The vector of raw bytes to deserialize |
The column types from Raven DataFrames are mapped to the corresponding R types. More specifically, all integer types (byte, short, int, long) are mapped to the R 'integer' type. The floating point types (float, double) are mapped to the R 'double' type. Both string and char types are mapped to the R 'character' type. Booleans are mapped to the R 'logical' type. Binary columns are represented as R 'list' types containing raw vectors.
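To make the mapping concrete, the following base R sketch builds by hand a data.frame with the shape described above. It does not call the package, and the column names and values are purely illustrative assumptions.

```r
# Hand-built data.frame mirroring the mapped types (illustrative names only).
df <- data.frame(
  id     = c(1L, 2L),        # byte/short/int/long -> integer
  score  = c(0.5, 1.5),      # float/double        -> double
  label  = c("a", "b"),      # string/char         -> character
  active = c(TRUE, FALSE),   # boolean             -> logical
  stringsAsFactors = FALSE
)

# A binary column is a list column whose elements are raw vectors.
df$payload <- list(as.raw(c(1, 2, 3)), charToRaw("hi"))

# Inspect the R storage type of every column.
types <- sapply(df, typeof)
print(types)
```

Each element of the list column is a raw vector of its own length, which is how arbitrary length byte arrays fit into a rectangular data.frame.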
A data.frame object from the specified raw vector
readDataFrame() for reading DataFrame (.df) files directly.

    ## Not run:
    # deserialize a raw vector representing a DataFrame
    df <- deserializeDataFrame(my.raw.vector)
    # get the types for all columns
    types <- sapply(df, typeof)
    ## End(Not run)

The file to be read must be a DataFrame (.df) file. The content of the file is returned as an R data.frame object.
readDataFrame(filepath)
| Argument | Description |
|---|---|
| filepath | The path to the file to read |
The column types from Raven DataFrames are mapped to the corresponding R types. More specifically, all integer types (byte, short, int, long) are mapped to the R 'integer' type. The floating point types (float, double) are mapped to the R 'double' type. Both string and char types are mapped to the R 'character' type. Booleans are mapped to the R 'logical' type. Binary columns are represented as R 'list' types containing raw vectors.
A data.frame object
deserializeDataFrame() for deserializing vectors of raw bytes. writeDataFrame() for writing DataFrame files which can be read by this function.

    ## Not run:
    # read a .df file into memory
    df <- readDataFrame("/path/to/my/file.df")
    # get the types for all columns
    types <- sapply(df, typeof)
    ## End(Not run)

The R data.frame is serialized as a Raven DataFrame. The concrete column types to use for each individual data.frame column can be specified by the 'types' argument.
serializeDataFrame(df, types = NULL, compress = FALSE, as.nullable = FALSE)
| Argument | Description |
|---|---|
| df | The data.frame object to serialize |
| types | The type names for all column types. Must be a vector of character values. May be NULL |
| compress | A logical indicating whether to compress the content of the returned raw vector |
| as.nullable | A logical indicating whether the data.frame should be serialized as a NullableDataFrame, even if it contains no NA values |
The column types of the R data.frame object are mapped to the corresponding Raven DataFrame column types. The following types exist:
| Type name | Description |
|---|---|
| byte | int8 |
| short | int16 |
| int | int32 |
| long | int64 |
| float | float32 |
| double | float64 |
| string | UTF-8 encoded unicode string |
| char | single printable ASCII character |
| boolean | logical value TRUE or FALSE |
| binary | arbitrary length byte array |
By default, if the 'types' argument is not explicitly specified, all values are mapped to the corresponding largest possible type in order to avoid possible loss of information. However, users can specify the concrete type for each column in the serialized DataFrame. This is done by providing a vector of character values denoting the type name of each corresponding data.frame column. The index of each entry corresponds to the index of the column in the underlying data.frame to serialize.
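Because the entries of 'types' are matched to columns by position, it can help to pick types by column name first and then reorder them. The following base R sketch shows one way to do that; the variable `chosen` and the specific type choices are illustrative assumptions, not part of the package.

```r
# Example data: 'cars' ships with R and has the columns 'speed' and 'dist'.
df <- cars

# Pick a type per column by name, in any order (illustrative choices).
chosen <- c(dist = "double", speed = "float")

# Reorder the choices so that entry i belongs to column i of the data.frame.
coltypes <- unname(chosen[names(df)])

# Sanity check: exactly one type name per column, none missing.
stopifnot(length(coltypes) == ncol(df), !anyNA(coltypes))
print(coltypes)
```

The resulting character vector has one entry per column in column order, which is the shape the 'types' argument expects.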
If the specified data.frame object contains at least one NA value, then the serialized DataFrame will represent a NullableDataFrame. If the data.frame contains no NA values, then the serialized DataFrame will represent a DefaultDataFrame, unless the 'as.nullable' argument is set to TRUE.
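The rule above can be mirrored in base R before serializing. This is only a sketch of the documented behaviour under the stated rule; `expectedKind` is a hypothetical helper and does not inspect what the package actually produces.

```r
# Two small data.frames: one complete, one with a missing value.
complete <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))
missing  <- data.frame(x = c(1, NA, 3), y = c("a", "b", "c"))

# Hypothetical helper (not part of the package): name the DataFrame kind
# that the documented rule implies for a given data.frame.
expectedKind <- function(df, as.nullable = FALSE) {
  # anyNA() on a data.frame reports whether any column contains an NA.
  if (anyNA(df) || as.nullable) "NullableDataFrame" else "DefaultDataFrame"
}

kind.complete <- expectedKind(complete)
kind.forced   <- expectedKind(complete, as.nullable = TRUE)
kind.missing  <- expectedKind(missing)
```

Setting 'as.nullable' to TRUE only matters for data without NA values; data that contains an NA is nullable either way.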
The logical 'compress' argument specifies whether the serialized DataFrame is compressed.
A raw vector representing the serialized data.frame object
writeDataFrame() for directly persisting data.frame objects to the file system.

    ## Not run:
    # get a data.frame
    df <- cars
    # serialize the data.frame to a raw vector
    vec <- serializeDataFrame(df)
    # specify the concrete types of all columns
    coltypes <- c("float", "double")
    # serialize the data.frame to a raw vector with concrete types
    serializeDataFrame(df, types = coltypes)
    ## End(Not run)

The R data.frame is persisted as a DataFrame (.df) file. The concrete column types to use for each individual data.frame column can be specified by the 'types' argument.
writeDataFrame(filepath, df, types = NULL, as.nullable = FALSE)
| Argument | Description |
|---|---|
| filepath | The path to the file to write |
| df | The data.frame object to write |
| types | The type names for all column types. Must be a vector of character values. May be NULL |
| as.nullable | A logical indicating whether the data.frame should be persisted as a NullableDataFrame, even if it contains no NA values |
The column types of the R data.frame object are mapped to the corresponding Raven DataFrame column types. The following types exist:
| Type name | Description |
|---|---|
| byte | int8 |
| short | int16 |
| int | int32 |
| long | int64 |
| float | float32 |
| double | float64 |
| string | UTF-8 encoded unicode string |
| char | single printable ASCII character |
| boolean | logical value TRUE or FALSE |
| binary | arbitrary length byte array |
By default, if the 'types' argument is not explicitly specified, all values are mapped to the corresponding largest possible type in order to avoid possible loss of information. However, users can specify the concrete type for each column in the DataFrame file to be written. This is done by providing a vector of character values denoting the type name of each corresponding data.frame column. The index of each entry corresponds to the index of the column in the underlying data.frame to persist.
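When choosing a narrower integer type than the default, the values must fit into that type's range. The following base R sketch checks this before writing. It assumes the usual signed ranges for int8, int16 and int32; `narrowestIntType` is a hypothetical helper, not a package function, and whether the package accepts a given type for a given column is not covered here.

```r
# Assumed signed value ranges of the narrower integer types in the table.
int.ranges <- list(
  byte  = c(-2^7,  2^7 - 1),
  short = c(-2^15, 2^15 - 1),
  int   = c(-2^31, 2^31 - 1)
)

# Hypothetical helper: name the narrowest listed type whose range holds
# all non-NA values of x, falling back to "long" if none of them does.
narrowestIntType <- function(x) {
  r <- range(x, na.rm = TRUE)
  for (name in names(int.ranges)) {
    bounds <- int.ranges[[name]]
    if (r[1] >= bounds[1] && r[2] <= bounds[2]) {
      return(name)
    }
  }
  "long"
}

small  <- narrowestIntType(c(4L, 25L))
medium <- narrowestIntType(c(-129L, 0L))
large  <- narrowestIntType(c(0L, 40000L))
```

A check like this guards against silently picking a type that cannot hold the data, which is the loss of information the default mapping is meant to avoid.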
If the specified data.frame object contains at least one NA value, then the DataFrame file to be persisted will represent a NullableDataFrame. If the data.frame contains no NA values, then the DataFrame file to be persisted will represent a DefaultDataFrame, unless the 'as.nullable' argument is set to TRUE.
The number of bytes written to the specified file
serializeDataFrame() for serializing data.frame objects to vectors of raw bytes. readDataFrame() for reading DataFrame files which have been previously persisted by this function.

    ## Not run:
    # get a data.frame
    df <- cars
    # write the data.frame to a .df file
    writeDataFrame("cars.df", df)
    # specify the concrete types of all columns
    coltypes <- c("float", "double")
    # write the data.frame to a .df file with concrete types
    writeDataFrame("cars.df", df, types = coltypes)
    ## End(Not run)
