| Title: | Probabilistic Decomposition of Archaeological Palimpsests |
|---|---|
| Description: | Probabilistic framework for the analysis of archaeological palimpsests based on the Stratigraphic Entanglement Field (SEF). Integrates spatial proximity, stratigraphic depth, chronological overlap, and cultural similarity to estimate latent depositional phases via diagonal Gaussian mixture Expectation-Maximisation (EM). Provides the Stratigraphic Entanglement Index (SEI), Excavation Stratigraphic Energy (ESE), and Palimpsest Dissolution Index (PDI) for quantifying depositional coherence, detecting intrusive finds, and measuring palimpsest formation. Includes simulation, diagnostics, phase-count selection, publication-quality plots, and Geographic Information System (GIS) export via 'sf'. Methods are described in Cocca (2026) <https://github.com/enzococca/palimpsestr>. |
| Authors: | Enzo Cocca [aut, cre] (ORCID: <https://orcid.org/0000-0002-8096-2810>) |
| Maintainer: | Enzo Cocca <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.12.0 |
| Built: | 2026-06-01 06:53:49 UTC |
| Source: | https://github.com/enzococca/palimpsestr |
Compares estimated phase assignments against known true labels using the Adjusted Rand Index (Hubert and Arabie, 1985). Values near 1 indicate perfect agreement; values near 0 indicate random agreement.
adjusted_rand_index(object, true_labels)adjusted_rand_index(object, true_labels)
object |
A |
true_labels |
Integer vector of known phase assignments. |
A single numeric value in .
Other validation:
bootstrap_sef(),
confusion_matrix(),
cv_sef(),
optimize_weights()
x <- archaeo_sim(n = 80, k = 3, seed = 1, mixing = 0.05) fit <- fit_sef(x, k = 3, seed = 1) adjusted_rand_index(fit, x$true_phase)x <- archaeo_sim(n = 80, k = 3, seed = 1, mixing = 0.05) fit <- fit_sef(x, k = 3, seed = 1) adjusted_rand_index(fit, x$true_phase)
Generates a synthetic excavation dataset with known latent phases, controlled spatial clustering, and configurable inter-phase mixing.
archaeo_sim(n = 150, k = 3, seed = NULL, mixing = 0.08)archaeo_sim(n = 150, k = 3, seed = NULL, mixing = 0.08)
n |
Number of observations (finds). |
k |
Number of latent depositional phases. |
seed |
Optional random seed for reproducibility. |
mixing |
Proportion of observations to perturb spatially and taphonomically, simulating post-depositional disturbance (0–1). |
A data.frame with columns: id, x, y, z,
context, date_min, date_max, class,
taf_score, true_phase.
fit_sef for fitting the SEF model to the output.
easy <- archaeo_sim(n = 100, k = 3, mixing = 0.05, seed = 1) table(easy$true_phase) hard <- archaeo_sim(n = 200, k = 4, mixing = 0.50, seed = 1) table(hard$true_phase)easy <- archaeo_sim(n = 100, k = 3, mixing = 0.05, seed = 1) table(easy$true_phase) hard <- archaeo_sim(n = 200, k = 4, mixing = 0.50, seed = 1) table(hard$true_phase)
Returns a data.frame combining dominant phase assignments, membership probabilities, entropy, local SEI, and energy.
as_phase_table(object)as_phase_table(object)
object |
A |
A data.frame with one row per find.
predict_phase, phase_diagnostic_table
Other diagnostics:
detect_intrusions(),
ese(),
pdi(),
phase_diagnostic_table(),
predict_phase()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) head(as_phase_table(fit))x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) head(as_phase_table(fit))
Wraps ggplotly with archaeological tooltips showing
find ID, context, phase probability, dating, class, entropy, and energy.
as_plotly(gg, tooltip = "text", ...)as_plotly(gg, tooltip = "text", ...)
gg |
A ggplot object produced by any |
tooltip |
Character vector of aesthetics to show. Defaults to
|
... |
Additional arguments passed to |
A plotly htmlwidget object.
gg_phasefield, gg_entropy,
gg_energy, gg_intrusions
Other plotting:
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) if (requireNamespace("ggplot2", quietly = TRUE) && requireNamespace("plotly", quietly = TRUE)) { p <- as_plotly(gg_phasefield(fit)) }x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) if (requireNamespace("ggplot2", quietly = TRUE) && requireNamespace("plotly", quietly = TRUE)) { p <- as_plotly(gg_phasefield(fit)) }
Extracts the strongest pairwise SEI connections as sf line geometries.
as_sf_links(object, quantile_threshold = 0.9, crs = NA_integer_)as_sf_links(object, quantile_threshold = 0.9, crs = NA_integer_)
object |
A |
quantile_threshold |
Quantile for retaining strongest links (default: 0.9). |
crs |
CRS for the output geometry. |
An sf object with columns from, to,
sei, and geometry.
Other GIS:
as_sf_phase()
if (requireNamespace("sf", quietly = TRUE)) { x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) links <- as_sf_links(fit) }if (requireNamespace("sf", quietly = TRUE)) { x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) links <- as_sf_links(fit) }
Creates an sf point object with phase assignments and diagnostics
for use in QGIS or spatial analysis.
as_sf_phase(object, crs = NA_integer_, dims = c("XY", "XYZ"))as_sf_phase(object, crs = NA_integer_, dims = c("XY", "XYZ"))
object |
A |
crs |
CRS passed to |
dims |
Either |
An sf object.
as_sf_links, phase_diagnostic_table
Other GIS:
as_sf_links()
if (requireNamespace("sf", quietly = TRUE)) { x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) pts <- as_sf_phase(fit) }if (requireNamespace("sf", quietly = TRUE)) { x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) pts <- as_sf_phase(fit) }
Resamples the data with replacement, refits the model, and computes the distribution of key statistics (PDI, mean entropy, mean energy, ARI if true labels provided). Returns percentile confidence intervals.
bootstrap_sef( object, n_boot = 100, conf = 0.95, true_labels = NULL, verbose = TRUE )bootstrap_sef( object, n_boot = 100, conf = 0.95, true_labels = NULL, verbose = TRUE )
object |
A |
n_boot |
Number of bootstrap replicates (default: 100). |
conf |
Confidence level (default: 0.95). |
true_labels |
Optional integer vector of true phase labels for ARI. |
verbose |
Print progress (default: TRUE). |
A data.frame with columns statistic, estimate,
lower, upper, se.
fit_sef, pdi,
adjusted_rand_index
Other validation:
adjusted_rand_index(),
confusion_matrix(),
cv_sef(),
optimize_weights()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2, seed = 1) bootstrap_sef(fit, n_boot = 20)x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2, seed = 1) bootstrap_sef(fit, n_boot = 20)
Fits the SEF model for each value of K and returns a summary table with BIC, PDI, entropy, energy, and other diagnostics.
compare_k(data, k_values = 2:6, ...)compare_k(data, k_values = 2:6, ...)
data |
Input data.frame. |
k_values |
Integer vector of candidate phase counts. |
... |
Additional arguments passed to |
A data.frame with one row per K value.
Other fitting:
fit_sef(),
reorder_phases(),
sef_summary()
x <- archaeo_sim(n = 100, k = 3, seed = 1) ck <- compare_k(x, k_values = 2:4) print(ck)x <- archaeo_sim(n = 100, k = 3, seed = 1) ck <- compare_k(x, k_values = 2:4) print(ck)
Cross-tabulates estimated phase assignments against known true labels. Phases are reordered to maximise diagonal agreement (Hungarian matching).
confusion_matrix(object, true_labels)confusion_matrix(object, true_labels)
object |
A |
true_labels |
Integer vector of known phase assignments. |
A matrix with estimated phases as rows and true phases as columns.
Other validation:
adjusted_rand_index(),
bootstrap_sef(),
cv_sef(),
optimize_weights()
x <- archaeo_sim(n = 80, k = 3, seed = 1, mixing = 0.05) fit <- fit_sef(x, k = 3, seed = 1) confusion_matrix(fit, x$true_phase)x <- archaeo_sim(n = 80, k = 3, seed = 1, mixing = 0.05) fit <- fit_sef(x, k = 3, seed = 1) confusion_matrix(fit, x$true_phase)
Splits the data into folds, fits on training folds, and evaluates log-likelihood on the held-out fold. Useful for comparing different K values or weight configurations.
cv_sef(data, k_values = 2:6, n_folds = 5, seed = 1, ...)cv_sef(data, k_values = 2:6, n_folds = 5, seed = 1, ...)
data |
A data.frame with archaeological find data. |
k_values |
Integer vector of candidate phase counts. |
n_folds |
Number of cross-validation folds (default: 5). |
seed |
Random seed for fold assignment. |
... |
Additional arguments passed to |
A data.frame with columns k, fold,
train_loglik, test_loglik, train_pdi.
Other validation:
adjusted_rand_index(),
bootstrap_sef(),
confusion_matrix(),
optimize_weights()
x <- archaeo_sim(n = 100, k = 3, seed = 1) cv <- cv_sef(x, k_values = 2:4, n_folds = 3) # Mean test log-likelihood per K aggregate(test_loglik ~ k, data = cv, FUN = mean)x <- archaeo_sim(n = 100, k = 3, seed = 1) cv <- cv_sef(x, k_values = 2:4, n_folds = 3) # Mean test log-likelihood per K aggregate(test_loglik ~ k, data = cv, FUN = mean)
Simulated dataset with 250 artefacts, 4 phases, and 50% mixing. Represents a heavily disturbed deposit where phases are difficult to separate.
demo_compresseddemo_compressed
A data.frame with 250 rows and 10 columns. See demo_easy
for column descriptions.
data(demo_compressed) ck <- compare_k(demo_compressed, k_values = 2:6) print(ck)data(demo_compressed) ck <- compare_k(demo_compressed, k_values = 2:6) print(ck)
Simulated dataset with 150 artefacts, 3 phases, and 5% mixing. Represents a site where occupation phases are clearly distinguishable.
demo_easydemo_easy
A data.frame with 150 rows and 10 columns:
Artefact identifier
Planimetric coordinates (metres)
Depth (metres)
Stratigraphic unit label
Chronological interval (CE)
Cultural class (ceramic, lithic, bone, metal)
Taphonomic disturbance score (0-1)
Known phase assignment (for validation)
data(demo_easy) fit <- fit_sef(demo_easy, k = 3) plot_phasefield(fit)data(demo_easy) fit <- fit_sef(demo_easy, k = 3) plot_phasefield(fit)
Simulated dataset with 200 artefacts, 3 phases, and 30% mixing. Represents a site with significant but resolvable depositional mixing.
demo_moderatedemo_moderate
A data.frame with 200 rows and 10 columns. See demo_easy
for column descriptions.
data(demo_moderate) fit <- fit_sef(demo_moderate, k = 3, tafonomy = "taf_score", context = "context") summary(fit)data(demo_moderate) fit <- fit_sef(demo_moderate, k = 3, tafonomy = "taf_score", context = "context") summary(fit)
Combines rescaled entropy, energy, and inverse local SEI into a composite intrusion probability score.
detect_intrusions(object)detect_intrusions(object)
object |
A |
A data.frame with columns id and intrusion_prob.
Other diagnostics:
as_phase_table(),
ese(),
pdi(),
phase_diagnostic_table(),
predict_phase()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) di <- detect_intrusions(fit) head(di[order(di$intrusion_prob, decreasing = TRUE), ])x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) di <- detect_intrusions(fit) head(di[order(di$intrusion_prob, decreasing = TRUE), ])
Measures local depositional disruption for each find by summing weighted dissimilarities with neighbours.
ese( data, coords = c("x", "y", "z"), chrono = c("date_min", "date_max"), class_col = "class", beta = c(1, 1, 1, 1), neighbourhood = NULL )ese( data, coords = c("x", "y", "z"), chrono = c("date_min", "date_max"), class_col = "class", beta = c(1, 1, 1, 1), neighbourhood = NULL )
data |
Input data.frame. |
coords |
Character vector of coordinate column names. |
chrono |
Character vector with minimum and maximum dating columns. |
class_col |
Class column name. |
beta |
Numeric vector of length 4: weights for spatial, vertical, temporal, and class mismatch. |
neighbourhood |
Maximum XY distance for neighbour inclusion.
When |
A numeric vector of local energy values.
Other diagnostics:
as_phase_table(),
detect_intrusions(),
pdi(),
phase_diagnostic_table(),
predict_phase()
x <- archaeo_sim(n = 30, k = 2, seed = 1) e <- ese(x) summary(e)x <- archaeo_sim(n = 30, k = 2, seed = 1) e <- ese(x) summary(e)
Writes phase assignments, intrusion scores, US summary, and model summary to CSV files in the specified directory.
export_results(object, dir, format = "csv", prefix = "palimpsestr")export_results(object, dir, format = "csv", prefix = "palimpsestr")
object |
A |
dir |
Output directory (created if it does not exist). Must be
specified explicitly; consider using |
format |
Export format: |
prefix |
File name prefix (default: |
Invisibly returns a character vector of written file paths.
us_summary_table, as_phase_table
Other export:
phase_transition_matrix(),
us_summary_table()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2, context = "context") export_results(fit, dir = tempdir())x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2, context = "context") export_results(fit, dir = tempdir())
Estimates latent depositional phases from spatial, stratigraphic, chronological, and cultural evidence using diagonal Gaussian mixture EM.
fit_sef( data, coords = c("x", "y", "z"), chrono = c("date_min", "date_max"), class = "class", tafonomy = NULL, context = NULL, harris = NULL, k = 3, weights = c(ws = 1, wz = 1, wt = 1, wc = 1), seed = 1, em_iter = 100, em_tol = 1e-05, n_init = 1, chrono_precision = FALSE, taf_as_feature = FALSE, residuality = FALSE, class_scale = FALSE, subclass = NULL )fit_sef( data, coords = c("x", "y", "z"), chrono = c("date_min", "date_max"), class = "class", tafonomy = NULL, context = NULL, harris = NULL, k = 3, weights = c(ws = 1, wz = 1, wt = 1, wc = 1), seed = 1, em_iter = 100, em_tol = 1e-05, n_init = 1, chrono_precision = FALSE, taf_as_feature = FALSE, residuality = FALSE, class_scale = FALSE, subclass = NULL )
data |
A data.frame with archaeological find data. |
coords |
Character vector of length 3 naming the x, y, z coordinate columns. |
chrono |
Character vector of length 2 naming the min and max dating columns. |
class |
Character scalar naming the material class column. |
tafonomy |
Optional column name for taphonomic disturbance scores (0–1). |
context |
Optional column name for stratigraphic unit labels. |
harris |
Optional |
k |
Integer number of phases to estimate. |
weights |
Named numeric vector with components |
seed |
Random seed for reproducibility. |
em_iter |
Maximum number of EM iterations (default: 100). |
em_tol |
Convergence tolerance on the log-likelihood. |
n_init |
Number of random initialisations. The run with the highest log-likelihood is retained (default: 1). |
chrono_precision |
Logical. If TRUE, add 1/tspan as a feature giving higher weight to precisely dated finds (default: FALSE). |
taf_as_feature |
Logical. If TRUE and |
residuality |
Logical. If TRUE and |
class_scale |
Logical. If TRUE, scale one-hot class columns by
|
subclass |
Optional column name for sub-class labels. If provided,
used instead of |
An S3 object of class sef_fit.
archaeo_sim, compare_k,
pdi, detect_intrusions
Other fitting:
compare_k(),
reorder_phases(),
sef_summary()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) print(fit) x <- archaeo_sim(n = 150, k = 3, seed = 42) fit <- fit_sef(x, k = 3, tafonomy = "taf_score", context = "context") summary(fit)x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) print(fit) x <- archaeo_sim(n = 150, k = 3, seed = 42) fit <- fit_sef(x, k = 3, tafonomy = "taf_score", context = "context") summary(fit)
Visualises bootstrap confidence intervals for key SEF diagnostics (PDI, entropy, energy, log-likelihood, and optionally ARI).
gg_bootstrap(bs_result)gg_bootstrap(bs_result)
bs_result |
Data.frame returned by |
A ggplot object.
Other plotting:
as_plotly(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
Three-panel plot showing BIC, PDI, and Mean Entropy for different K values.
gg_compare_k(ck)gg_compare_k(ck)
ck |
Data.frame from |
A ggplot object.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
Plots a heatmap of the confusion matrix between estimated and known true phase assignments.
gg_confusion(object, true_labels)gg_confusion(object, true_labels)
object |
A |
true_labels |
Integer vector of known true phase assignments. |
A ggplot object.
confusion_matrix, adjusted_rand_index
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
if (requireNamespace("ggplot2", quietly = TRUE)) { x <- archaeo_sim(n = 80, k = 3, seed = 1, mixing = 0.05) fit <- fit_sef(x, k = 3, seed = 1) gg_confusion(fit, x$true_phase) }if (requireNamespace("ggplot2", quietly = TRUE)) { x <- archaeo_sim(n = 80, k = 3, seed = 1, mixing = 0.05) fit <- fit_sef(x, k = 3, seed = 1) gg_confusion(fit, x$true_phase) }
Plots the log-likelihood at each EM iteration to verify convergence.
gg_convergence(object)gg_convergence(object)
object |
A |
A ggplot object.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
if (requireNamespace("ggplot2", quietly = TRUE)) { x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) gg_convergence(fit) }if (requireNamespace("ggplot2", quietly = TRUE)) { x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) gg_convergence(fit) }
Plots mean test and training log-likelihood across K values from
cv_sef output.
gg_cv(cv_result)gg_cv(cv_result)
cv_result |
Data.frame returned by |
A ggplot object.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
Shows ESE values across space. High energy indicates zones where neighbouring artefacts are dissimilar (depositional disruption).
gg_energy(object, xlabel = "Easting (m)", ylabel = "Northing (m)")gg_energy(object, xlabel = "Easting (m)", ylabel = "Northing (m)")
object |
A |
xlabel, ylabel
|
Axis labels. |
A ggplot object.
gg_entropy,
as_plotly for interactive version.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
Shows Shannon entropy of phase probabilities across the excavation area. High entropy = uncertain phase assignment (possible palimpsest zone).
gg_entropy(object, xlabel = "Easting (m)", ylabel = "Northing (m)")gg_entropy(object, xlabel = "Easting (m)", ylabel = "Northing (m)")
object |
A |
xlabel, ylabel
|
Axis labels. |
A ggplot object.
plot_entropy for base R version,
as_plotly for interactive version.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
Maps the probability that each find is an intrusion (displaced or redeposited). Top suspects are circled and labelled.
gg_intrusions( object, top_n = 5, xlabel = "Easting (m)", ylabel = "Northing (m)" )gg_intrusions( object, top_n = 5, xlabel = "Easting (m)", ylabel = "Northing (m)" )
object |
A |
top_n |
Number of top intrusions to label. |
xlabel, ylabel
|
Axis labels. |
A ggplot object.
detect_intrusions,
as_plotly for interactive version.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
Displays SEF analysis results directly on the excavation plan. Polygons represent stratigraphic units; points represent individual finds.
gg_map( object, geometries, layer = c("phase", "entropy", "energy", "intrusion"), show_labels = TRUE, show_points = TRUE, xlabel = NULL, ylabel = NULL )gg_map( object, geometries, layer = c("phase", "entropy", "energy", "intrusion"), show_labels = TRUE, show_points = TRUE, xlabel = NULL, ylabel = NULL )
object |
A |
geometries |
An sf object with excavation plan polygons.
Must contain a |
layer |
What to display: |
show_labels |
Show US labels on polygons. |
show_points |
Overlay individual finds as points. |
xlabel, ylabel
|
Axis labels (default: NULL = use CRS units). |
A ggplot object.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
Plots finds along the depth (z) axis, coloured by phase assignment, to visualise the stratigraphic ordering of phases.
gg_phase_profile(object)gg_phase_profile(object)
object |
A |
A ggplot object.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
if (requireNamespace("ggplot2", quietly = TRUE)) { x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) gg_phase_profile(fit) }if (requireNamespace("ggplot2", quietly = TRUE)) { x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) gg_phase_profile(fit) }
Plots artefact positions coloured by dominant phase, with point size proportional to assignment confidence.
gg_phasefield(object, xlabel = "Easting (m)", ylabel = "Northing (m)")gg_phasefield(object, xlabel = "Easting (m)", ylabel = "Northing (m)")
object |
A |
xlabel, ylabel
|
Axis labels (default: "Easting (m)" / "Northing (m)"). |
A ggplot object.
plot_phasefield for base R version,
as_plotly for interactive version.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
Plots the mean test log-likelihood across weight configurations
from optimize_weights output.
gg_weights(opt_result)gg_weights(opt_result)
opt_result |
List returned by |
A ggplot object.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
plot_energy(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
Infers vertical ordering between stratigraphic units from the mean depth of finds in each context, and builds a penalty matrix that discourages finds from different vertical zones being assigned to the same phase.
harris_from_contexts( data, z_col = "z", context_col = "context", penalty_scale = 0.5 )harris_from_contexts( data, z_col = "z", context_col = "context", penalty_scale = 0.5 )
data |
A data.frame with find data. |
z_col |
Name of the depth column. |
context_col |
Name of the context column. |
penalty_scale |
Penalty magnitude for cross-context assignments. |
An symmetric penalty matrix.
Other harris:
read_harris(),
validate_phases_harris()
x <- archaeo_sim(n = 60, k = 2, seed = 1) H <- harris_from_contexts(x, z_col = "z", context_col = "context") dim(H)x <- archaeo_sim(n = 60, k = 2, seed = 1) H <- harris_from_contexts(x, z_col = "z", context_col = "context") dim(H)
Opens an interactive Shiny application for loading data, fitting the SEF model, and exploring results through plots, tables, and maps.
launch_app()launch_app()
The app requires the following packages (installed but not
imported by palimpsestr): shiny, shinydashboard, DT.
For interactive plots install plotly. For database connections
install DBI plus RSQLite or RPostgres. For GIS
maps install sf.
Launches the Shiny app (does not return).
## Not run: launch_app() ## End(Not run)## Not run: launch_app() ## End(Not run)
Reads stratigraphic unit polygons from Shapefile, GeoPackage,
GeoJSON, or a PostGIS database for use with gg_map.
load_geometries( source, layer = NULL, query = NULL, us_column = "us", crs = NULL )load_geometries( source, layer = NULL, query = NULL, us_column = "us", crs = NULL )
source |
File path or |
layer |
Layer name for multi-layer sources. |
query |
SQL query for database connections. |
us_column |
Column containing US/context identifiers. |
crs |
Target CRS for reprojection (optional). |
An sf object with a us column.
Other data-import:
read_db()
Aggregates the SEI matrix by row, yielding a per-observation measure of total depositional coherence with all other finds.
local_sei(sei_mat)local_sei(sei_mat)
sei_mat |
A symmetric SEI matrix from |
A numeric vector of length nrow(sei_mat).
Other SEI:
sei_matrix(),
sei_sparse()
x <- archaeo_sim(n = 30, k = 2, seed = 1) S <- sei_matrix(x) lsei <- local_sei(S) summary(lsei)x <- archaeo_sim(n = 30, k = 2, seed = 1) S <- sei_matrix(x) lsei <- local_sei(S) summary(lsei)
Tests a grid of weight configurations and selects the one that maximises the mean held-out log-likelihood across folds. This provides a data-driven alternative to manual weight specification.
optimize_weights( data, k, weight_grid = NULL, n_folds = 3, seed = 1, verbose = TRUE, ... )optimize_weights( data, k, weight_grid = NULL, n_folds = 3, seed = 1, verbose = TRUE, ... )
data |
A data.frame with archaeological find data. |
k |
Number of phases. |
weight_grid |
A data.frame with columns |
n_folds |
Number of cross-validation folds (default: 3). |
seed |
Random seed. |
verbose |
Print progress (default: TRUE). |
... |
Additional arguments passed to |
A list with components:
Named numeric vector of optimal weights.
Data.frame with all tested configurations and their scores.
Other validation:
adjusted_rand_index(),
bootstrap_sef(),
confusion_matrix(),
cv_sef()
x <- archaeo_sim(n = 80, k = 3, seed = 1) opt <- optimize_weights(x, k = 3, n_folds = 3) opt$best_weightsx <- archaeo_sim(n = 80, k = 3, seed = 1) opt <- optimize_weights(x, k = 3, n_folds = 3) opt$best_weights
Measures global phase separability as .
Values close to 1 indicate well-separated phases; values near 0 indicate
a compressed palimpsest.
pdi(object)pdi(object)
object |
A |
A single numeric value between 0 and 1.
Other diagnostics:
as_phase_table(),
detect_intrusions(),
ese(),
phase_diagnostic_table(),
predict_phase()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) pdi(fit)x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) pdi(fit)
Combines the input data with dominant phase, phase probabilities, entropy, local SEI, and energy in a single data.frame.
phase_diagnostic_table(object)phase_diagnostic_table(object)
object |
A |
A data.frame with all input columns plus diagnostics.
Other diagnostics:
as_phase_table(),
detect_intrusions(),
ese(),
pdi(),
predict_phase()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) pdt <- phase_diagnostic_table(fit) names(pdt)x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) pdt <- phase_diagnostic_table(fit) names(pdt)
Computes how often finds from phase are found directly
above finds from phase in the vertical sequence,
revealing the stratigraphic ordering of phases.
phase_transition_matrix(object)phase_transition_matrix(object)
object |
A |
A matrix where entry [i,j]
counts transitions from phase i (above) to phase j (below).
Other export:
export_results(),
us_summary_table()
x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) phase_transition_matrix(fit)x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) phase_transition_matrix(fit)
Plot local Excavation Stratigraphic Energy (base R)
plot_energy(object)plot_energy(object)
object |
A |
Invisibly returns the object.
gg_energy for the ggplot2 version.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_entropy(),
plot_phasefield(),
plot_sei_profile()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) plot_energy(fit)x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) plot_energy(fit)
Plot entropy across space (base R)
plot_entropy(object)plot_entropy(object)
object |
A |
Invisibly returns the object.
gg_entropy for the ggplot2/plotly version.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_phasefield(),
plot_sei_profile()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) plot_entropy(fit)x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) plot_entropy(fit)
Plot dominant phase assignment (base R)
plot_phasefield(object)plot_phasefield(object)
object |
A |
Invisibly returns the object.
gg_phasefield for the ggplot2/plotly version.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_sei_profile()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) plot_phasefield(fit)x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) plot_phasefield(fit)
Plot ordered SEI profile (base R)
plot_sei_profile(object)plot_sei_profile(object)
object |
A |
Invisibly returns the object.
Other plotting:
as_plotly(),
gg_bootstrap(),
gg_compare_k(),
gg_confusion(),
gg_convergence(),
gg_cv(),
gg_energy(),
gg_entropy(),
gg_intrusions(),
gg_map(),
gg_phase_profile(),
gg_phasefield(),
gg_weights(),
plot_energy(),
plot_entropy(),
plot_phasefield()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) plot_sei_profile(fit)x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) plot_sei_profile(fit)
Convenience alias for as_phase_table.
predict_phase(object)predict_phase(object)
object |
A |
A data.frame with probabilities and diagnostics.
Other diagnostics:
as_phase_table(),
detect_intrusions(),
ese(),
pdi(),
phase_diagnostic_table()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) head(predict_phase(fit))x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) head(predict_phase(fit))
Loads find data from any DBI-compatible database and maps columns
to the format expected by fit_sef.
read_db( conn, query = NULL, table = NULL, col_id = "id", col_x = "x", col_y = "y", col_z = "z", col_context = "context", col_date_min = "date_min", col_date_max = "date_max", col_class = "class", col_taf = NULL, schema = "custom", sito = NULL )read_db( conn, query = NULL, table = NULL, col_id = "id", col_x = "x", col_y = "y", col_z = "z", col_context = "context", col_date_min = "date_min", col_date_max = "date_max", col_class = "class", col_taf = NULL, schema = "custom", sito = NULL )
conn |
A |
query |
SQL query returning data with the required columns. |
table |
Table name (alternative to |
col_id, col_x, col_y, col_z, col_context, col_date_min, col_date_max, col_class, col_taf
|
Column name mappings. |
schema |
Use |
sito |
Site filter for PyArchInit schema. |
A data.frame ready for fit_sef.
Other data-import:
load_geometries()
Reads a CSV file with columns from, to, and optionally
weight, and converts it to an penalty
matrix aligned with the find-level data.
read_harris(file, contexts, default_weight = 1)read_harris(file, contexts, default_weight = 1)
file |
Path to CSV with columns |
contexts |
Character vector of context labels for each find (length = number of finds). |
default_weight |
Weight for edges without an explicit weight. |
An penalty matrix.
Other harris:
harris_from_contexts(),
validate_phases_harris()
Relabels phases so that phase 1 corresponds to the deepest (oldest) stratum and phase K to the shallowest (most recent). This ensures consistent, interpretable phase numbering across different runs.
reorder_phases(object)reorder_phases(object)
object |
A |
A sef_fit object with reordered phases.
Other fitting:
compare_k(),
fit_sef(),
sef_summary()
x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) fit <- reorder_phases(fit)x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3) fit <- reorder_phases(fit)
Produces a structured Markdown report covering phase composition, intrusion detection, stratigraphic unit purity, and recommendations. Available in English and Italian.
report_sef(object, lang = c("en", "it"), file = NULL)report_sef(object, lang = c("en", "it"), file = NULL)
object |
A |
lang |
Language: |
file |
Optional file path to save the report. |
Character string with the report text (invisibly).
x <- archaeo_sim(n = 100, k = 3, seed = 1) fit <- fit_sef(x, k = 3) report_sef(fit, lang = "en")x <- archaeo_sim(n = 100, k = 3, seed = 1) fit <- fit_sef(x, k = 3) report_sef(fit, lang = "en")
Returns a named list with global diagnostics and phase counts.
sef_summary(object)sef_summary(object)
object |
A |
A named list.
Other fitting:
compare_k(),
fit_sef(),
reorder_phases()
x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) sef_summary(fit)x <- archaeo_sim(n = 60, k = 2, seed = 1) fit <- fit_sef(x, k = 2) sef_summary(fit)
Builds an symmetric matrix quantifying pairwise
depositional coherence from spatial, vertical, temporal, and cultural evidence.
sei_matrix( data, coords = c("x", "y", "z"), chrono = c("date_min", "date_max"), class_col = "class", weights = c(ws = 1, wz = 1, wt = 1, wc = 1), eps = 1e-09, z_floor = 0.25, max_dist = NULL )sei_matrix( data, coords = c("x", "y", "z"), chrono = c("date_min", "date_max"), class_col = "class", weights = c(ws = 1, wz = 1, wt = 1, wc = 1), eps = 1e-09, z_floor = 0.25, max_dist = NULL )
data |
Input data.frame. |
coords |
Character vector of coordinate column names (x, y, z). |
chrono |
Character vector with minimum and maximum dating columns. |
class_col |
Class column name. |
weights |
Named numeric vector with components |
eps |
Small value to avoid division by zero in spatial distance. |
z_floor |
Minimum vertical denominator. |
max_dist |
Maximum spatial distance for pair inclusion. When
|
A symmetric numeric matrix with zero diagonal.
SEI values are normalised within each dataset. Absolute SEI values are not directly comparable across different excavations or datasets of different sizes. Use SEI for within-dataset ranking and relative comparisons only.
Other SEI:
local_sei(),
sei_sparse()
x <- archaeo_sim(n = 30, k = 2, seed = 1) S <- sei_matrix(x) dim(S)x <- archaeo_sim(n = 30, k = 2, seed = 1) S <- sei_matrix(x) dim(S)
Wrapper around sei_matrix that automatically sets
max_dist when the dataset exceeds a size threshold,
using the 25th percentile of pairwise distances as the cutoff.
sei_sparse( data, coords = c("x", "y", "z"), chrono = c("date_min", "date_max"), class_col = "class", weights = c(ws = 1, wz = 1, wt = 1, wc = 1), eps = 1e-09, z_floor = 0.25, n_threshold = 1000 )sei_sparse( data, coords = c("x", "y", "z"), chrono = c("date_min", "date_max"), class_col = "class", weights = c(ws = 1, wz = 1, wt = 1, wc = 1), eps = 1e-09, z_floor = 0.25, n_threshold = 1000 )
data |
Input data.frame. |
coords |
Character vector of coordinate column names (x, y, z). |
chrono |
Character vector with minimum and maximum dating columns. |
class_col |
Class column name. |
weights |
Named numeric vector with components |
eps |
Small value to avoid division by zero in spatial distance. |
z_floor |
Minimum vertical denominator. |
n_threshold |
Datasets larger than this trigger sparse mode (default: 1000). |
A numeric matrix (same as sei_matrix).
Other SEI:
local_sei(),
sei_matrix()
x <- archaeo_sim(n = 50, k = 2, seed = 1) S <- sei_sparse(x)x <- archaeo_sim(n = 50, k = 2, seed = 1) S <- sei_sparse(x)
Summarise a fitted SEF model
## S3 method for class 'sef_fit' summary(object, ...)## S3 method for class 'sef_fit' summary(object, ...)
object |
A |
... |
Ignored. |
A named list.
Aggregates finds by context (US), reporting the dominant phase, purity (proportion of finds in dominant phase), mean entropy, mean energy, and intrusion count.
us_summary_table(object)us_summary_table(object)
object |
A |
A data.frame with one row per stratigraphic unit.
export_results, phase_diagnostic_table
Other export:
export_results(),
phase_transition_matrix()
x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3, context = "context") us_summary_table(fit)x <- archaeo_sim(n = 80, k = 3, seed = 1) fit <- fit_sef(x, k = 3, context = "context") us_summary_table(fit)
Checks whether the estimated phases follow the expected vertical ordering within each stratigraphic unit pair.
validate_phases_harris(object)validate_phases_harris(object)
object |
A |
A data.frame with one row per context pair, indicating whether the dominant phase ordering is consistent with depth.
harris_from_contexts, us_summary_table
Other harris:
harris_from_contexts(),
read_harris()
x <- archaeo_sim(n = 60, k = 3, seed = 1) fit <- fit_sef(x, k = 3, context = "context") validate_phases_harris(fit)x <- archaeo_sim(n = 60, k = 3, seed = 1) fit <- fit_sef(x, k = 3, context = "context") validate_phases_harris(fit)
Real archaeological dataset from the site of Poggio Gramignano (Lugnano in Teverina, TR, Italy), a multi-period Roman villa with an annexed Late Antique infant cemetery. The dataset contains 615 inventoried finds from 54 stratigraphic units spanning from the Pre-Roman period (7th–4th c. BCE) through the Late Antique (5th–6th c. CE).
villa_romanavilla_romana
A data.frame with 615 rows and 9 variables:
Unique find identifier (VRPG_0001 to VRPG_0615)
Easting coordinate (metres, EPSG:3004)
Northing coordinate (metres, EPSG:3004)
Mean elevation of the stratigraphic unit (metres a.s.l.)
Stratigraphic unit label (e.g. US_76, US_117)
Start of chronological interval (BCE as negative, CE as positive)
End of chronological interval
Material class (e.g. Reperto Ceramico, Reperto Anforaceo)
Taphonomic integrity score (0 = fully disturbed, 1 = pristine in situ)
Chronological data derives from the site periodisation. Coordinates are US centroids in Monte Mario / Italy zone 2 (EPSG:3004). Elevation (z) is the mean of recorded quotes per US. Taphonomic scores were assigned by the excavator based on depositional context (primary, secondary, redeposition, backfill, disturbance).
Data from the pyArchInit database of Poggio Gramignano (VRPG 3004). Taphonomic scores compiled by Roberto Montagnetti.
data(villa_romana) str(villa_romana) table(villa_romana$class)data(villa_romana) str(villa_romana) table(villa_romana$class)