###############################################################################
# The Institute for the Design of Advanced Energy Systems Integrated Platform
# Framework (IDAES IP) was produced under the DOE Institute for the
# Design of Advanced Energy Systems (IDAES).
#
# Copyright (c) 2018-2026 by the software owners: The Regents of the
# University of California, through Lawrence Berkeley National Laboratory,
# National Technology & Engineering Solutions of Sandia, LLC, Carnegie Mellon
# University, West Virginia University Research Corporation, et al.
# All rights reserved. Please see the files COPYRIGHT.md and LICENSE.md
# for full copyright and license information.
###############################################################################
Parameter Estimation Using the NRTL State Block
Author: Jaffer Ghouse
Maintainer: Stephen Cini
Updated: 2026-06-11
In this module, we use Pyomo’s parmest tool in conjunction with IDAES models for parameter estimation. We demonstrate these tools by estimating the parameters associated with the NRTL property model for a benzene-toluene mixture. The NRTL model has 2 sets of parameters: the non-randomness parameter (alpha_ij) and the binary interaction parameter (tau_ij), where i and j are the pure component species. In this example, we only estimate the binary interaction parameter (tau_ij) for a given dataset. When estimating parameters associated with the property package, IDAES provides the flexibility of doing the parameter estimation by just using the state block or by using a unit model with a specified property package. This module will demonstrate parameter estimation by using only the state block.
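For readers who want to see how alpha_ij and tau_ij enter the model, the following is a minimal sketch of the textbook binary NRTL activity coefficient equations in plain Python. The function name `nrtl_binary_gamma` and the mapping of species 1 and 2 to benzene and toluene are assumptions made for illustration; the IDAES property package may use a different internal formulation or index convention.

```python
import math


def nrtl_binary_gamma(x1, tau12, tau21, alpha=0.3):
    """Return (gamma1, gamma2) for a binary mixture from the textbook NRTL equations."""
    x2 = 1.0 - x1
    # G_ij = exp(-alpha_ij * tau_ij), assuming alpha_12 = alpha_21 = alpha
    g12 = math.exp(-alpha * tau12)
    g21 = math.exp(-alpha * tau21)
    ln_gamma1 = x2**2 * (
        tau21 * (g21 / (x1 + x2 * g21)) ** 2
        + tau12 * g12 / (x2 + x1 * g12) ** 2
    )
    ln_gamma2 = x1**2 * (
        tau12 * (g12 / (x2 + x1 * g12)) ** 2
        + tau21 * g21 / (x1 + x2 * g21) ** 2
    )
    return math.exp(ln_gamma1), math.exp(ln_gamma2)


# With both tau values at zero the mixture is ideal, so both coefficients are 1.
gamma_ideal = nrtl_binary_gamma(0.5, 0.0, 0.0)

# Illustrative call using the initial guesses that appear later in this module.
gamma_guess = nrtl_binary_gamma(0.5, tau12=-0.9, tau21=1.4)
print(gamma_ideal, gamma_guess)
```

The sketch only illustrates why tau_ij controls the departure from ideal behavior; the estimation itself is carried out on the IDAES state block.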
We will complete the following tasks:
Set up a method to return an initialized model
Set up the parameter estimation problem using parmest
Analyze the results
Demonstrate advanced features using parmest
Key links to documentation:
# Todo: import ConcreteModel, value, and Suffix from pyomo.environ
from pyomo.environ import ConcreteModel, value, Suffix
# Todo: import FlowsheetBlock from idaes.core
from idaes.core import FlowsheetBlock
In the next cell, we import the parameter block used in this module and the idaes logger.
from idaes.models.properties.activity_coeff_models.BTX_activity_coeff_VLE import (
BTXParameterBlock,
)
import idaes.logger as idaeslog
In the next cell, we import parmest from Pyomo and the pandas package. We need pandas because parmest uses pandas.DataFrame objects to handle the input data and the results.
import pyomo.contrib.parmest.parmest as parmest
import pandas as pd
Setting up an initialized model
We need to provide a method that returns an initialized model to the parmest tool in Pyomo.
def NRTL_model(data):
    # Todo: Create a ConcreteModel object
    m = ConcreteModel()

    # Todo: Create FlowsheetBlock object
    m.fs = FlowsheetBlock(dynamic=False)

    # Todo: Create a properties parameter object with the following options:
    # "valid_phase": ('Liq', 'Vap')
    # "activity_coeff_model": 'NRTL'
    m.fs.properties = BTXParameterBlock(
        valid_phase=("Liq", "Vap"), activity_coeff_model="NRTL"
    )
    m.fs.state_block = m.fs.properties.build_state_block(defined_state=True)

    # Fix the state variables on the state block
    # hint: state variables exist on the state block i.e. on m.fs.state_block
    m.fs.state_block.flow_mol.fix(1)
    m.fs.state_block.temperature.fix(368)
    m.fs.state_block.pressure.fix(101325)
    m.fs.state_block.mole_frac_comp["benzene"].fix(0.5)
    m.fs.state_block.mole_frac_comp["toluene"].fix(0.5)

    # Fix NRTL specific parameters.
    # non-randomness parameter - alpha_ij (set at 0.3, 0 if i=j)
    m.fs.properties.alpha["benzene", "benzene"].fix(0)
    m.fs.properties.alpha["benzene", "toluene"].fix(0.3)
    m.fs.properties.alpha["toluene", "toluene"].fix(0)
    m.fs.properties.alpha["toluene", "benzene"].fix(0.3)

    # binary interaction parameter - tau_ij (0 if i=j, else to be estimated
    # later but fixed here so the model can be initialized)
    m.fs.properties.tau["benzene", "benzene"].fix(0)
    m.fs.properties.tau["benzene", "toluene"].fix(-0.9)
    m.fs.properties.tau["toluene", "toluene"].fix(0)
    m.fs.properties.tau["toluene", "benzene"].fix(1.4)

    # Initialize the state block
    m.fs.state_block.initialize(outlvl=idaeslog.INFO_LOW)

    # Fix the temperature at the value from the data point
    if isinstance(data, (dict, pd.Series)):
        m.fs.state_block.temperature.fix(float(data["temperature"]))
    elif isinstance(data, pd.DataFrame):
        m.fs.state_block.temperature.fix(float(data.iloc[0]["temperature"]))
    else:
        raise ValueError("Unrecognized data type.")

    # Set bounds on variables to be estimated
    m.fs.properties.tau["benzene", "toluene"].setlb(-5)
    m.fs.properties.tau["benzene", "toluene"].setub(5)
    m.fs.properties.tau["toluene", "benzene"].setlb(-5)
    m.fs.properties.tau["toluene", "benzene"].setub(5)

    # Return the initialized model
    return m
Parameter estimation using parmest
In addition to providing a method to return an initialized model, the parmest tool needs the following:
Experiment class to set up and label model with suffixes
Dataset with multiple scenarios - organized into an experiment list
Here we build an experiment class to label our model problem for parameter estimation. The labels are defined as a Suffix, and the main labels for our model are experiment_outputs, unknown_parameters, and measurement_error.
For this problem, the error is computed between the model prediction and the data for the mole fraction of benzene in the vapor and liquid phases. The experiment_outputs are therefore the mole fractions of benzene in the two phases.
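As a rough illustration of the error measure described above, the sketch below computes a sum of squared errors for a single data point in plain Python. The helper `sse` and the predicted values are hypothetical and exist only to show the arithmetic; parmest builds its own objective from the labeled model.

```python
def sse(predicted, measured):
    """Sum of squared errors over matching output names."""
    return sum((predicted[name] - measured[name]) ** 2 for name in measured)


# One data point taken from the first row of the dataset shown later in this module.
measured = {"liq_benzene": 0.480953, "vap_benzene": 0.692110}
# Hypothetical model predictions, chosen only to illustrate the arithmetic.
predicted = {"liq_benzene": 0.48, "vap_benzene": 0.69}

point_error = sse(predicted, measured)
print(point_error)
```

Summing this quantity over all data points gives the kind of objective that the estimation minimizes with respect to the unknown parameters.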
In this example, we only estimate the binary interaction parameter (tau_ij). Given that this variable is usually indexed as tau_ij = Var(component_list, component_list), there are 2*2=4 degrees of freedom. However, when i=j, the binary interaction parameter is 0. Therefore, in this problem, we estimate the binary interaction parameter for the following variables only:
fs.properties.tau['benzene', 'toluene']
fs.properties.tau['toluene', 'benzene']
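The counting argument above can be checked with a short sketch that enumerates the index pairs of tau_ij. The variable names are illustrative only and are not part of the IDAES or Pyomo API.

```python
from itertools import product

components = ["benzene", "toluene"]

# Every (i, j) index of tau_ij: 2 * 2 = 4 entries.
all_pairs = list(product(components, repeat=2))

# Entries with i == j are fixed at zero; the rest are estimated.
fixed_pairs = [(i, j) for i, j in all_pairs if i == j]
estimated_pairs = [(i, j) for i, j in all_pairs if i != j]

print(estimated_pairs)
```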
As shown below, these model components are used as our unknown_parameters.
We define measurement_error as None so that parmest calculates the value internally based on the experimental outputs. Refer to https://pyomo.readthedocs.io/en/stable/explanation/analysis/parmest/driver.html for more information.
# Build an experiment class to take advantage of new parmest interface
from pyomo.contrib.parmest.experiment import Experiment


class NRTLExperiment(Experiment):
    """Experiment class for parameter estimation of NRTL model using parmest"""

    def __init__(self, data, meas_error=None):
        """Initialize the NRTLExperiment class

        Args:
            data: DataFrame containing the experimental data
            meas_error: Measurement error for the data (optional)
        """
        self.model = None
        self.data = data
        self.meas_error = meas_error

    def create_model(self):
        """Create the Pyomo model for the NRTL parameter estimation problem"""
        self.model = NRTL_model(self.data)

    def label_model(self):
        from pyomo.environ import Set, Expression

        m = self.model

        # Parmest expects the first index of experiment outputs to be the data point
        # This is a workaround that will be addressed and corrected in a future release.
        m.data_point = Set(initialize=[0])

        # Wrap IDAES variables in Expressions indexed by data point
        m.liq_benzene_out = Expression(
            m.data_point,
            rule=lambda m, i: m.fs.state_block.mole_frac_phase_comp["Liq", "benzene"],
        )
        m.vap_benzene_out = Expression(
            m.data_point,
            rule=lambda m, i: m.fs.state_block.mole_frac_phase_comp["Vap", "benzene"],
        )

        m.experiment_outputs = Suffix(direction=Suffix.LOCAL)
        m.experiment_outputs[m.liq_benzene_out[0]] = float(self.data["liq_benzene"])
        m.experiment_outputs[m.vap_benzene_out[0]] = float(self.data["vap_benzene"])

        m.measurement_error = Suffix(direction=Suffix.LOCAL)
        m.measurement_error.update(
            [
                (m.liq_benzene_out[0], self.meas_error),
                (m.vap_benzene_out[0], self.meas_error),
            ]
        )

        # Add unknown parameters to the model for easier access
        m.unknown_parameters = Suffix(direction=Suffix.LOCAL)
        m.unknown_parameters.update(
            (k, value(k))
            for k in [
                m.fs.properties.tau["benzene", "toluene"],
                m.fs.properties.tau["toluene", "benzene"],
            ]
        )

    def get_labeled_model(self):
        """Return the labeled model"""
        if self.model is None:
            self.create_model()
            self.label_model()
        return self.model
Pyomo’s parmest tool supports the following data formats:
pandas DataFrame
list of dictionaries
list of JSON file names
Please see the documentation for more details.
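To make the "list of dictionaries" format concrete, here is a minimal sketch that turns CSV text into one dictionary per data point using only the Python standard library. The inlined text reproduces the first three rows of the dataset used in this module; the comma-separated layout is an assumption about the file, and the name `records` is illustrative.

```python
import csv
import io

# First three rows of the dataset used in this module, inlined so the sketch
# is self-contained (assumed comma-separated layout).
csv_text = """temperature,liq_benzene,vap_benzene
365.500000,0.480953,0.692110
365.617647,0.462444,0.667699
365.735294,0.477984,0.692441
"""

# csv.DictReader yields one dict per row; convert the string fields to floats.
records = [
    {key: float(val) for key, val in row.items()}
    for row in csv.DictReader(io.StringIO(csv_text))
]

print(records[0]["temperature"])
```

Each entry of `records` has the same keys that NRTL_model reads from a dict, so a single entry could stand in for one data point.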
For this example, we load data from the csv file BT_NRTL_dataset.csv. The dataset consists of fifty data points which provide the mole fraction of benzene in the vapor and liquid phase as a function of temperature.
# Load data from csv
data = pd.read_csv("BT_NRTL_dataset.csv")
# Display the dataset
display(data)
| | temperature | liq_benzene | vap_benzene |
|---|---|---|---|
| 0 | 365.500000 | 0.480953 | 0.692110 |
| 1 | 365.617647 | 0.462444 | 0.667699 |
| 2 | 365.735294 | 0.477984 | 0.692441 |
| 3 | 365.852941 | 0.440547 | 0.640336 |
| 4 | 365.970588 | 0.427421 | 0.623328 |
| 5 | 366.088235 | 0.442725 | 0.647796 |
| 6 | 366.205882 | 0.434374 | 0.637691 |
| 7 | 366.323529 | 0.444642 | 0.654933 |
| 8 | 366.441176 | 0.427132 | 0.631229 |
| 9 | 366.558824 | 0.446301 | 0.661743 |
| 10 | 366.676471 | 0.438004 | 0.651591 |
| 11 | 366.794118 | 0.425320 | 0.634814 |
| 12 | 366.911765 | 0.439435 | 0.658047 |
| 13 | 367.029412 | 0.435655 | 0.654539 |
| 14 | 367.147059 | 0.401350 | 0.604987 |
| 15 | 367.264706 | 0.397862 | 0.601703 |
| 16 | 367.382353 | 0.415821 | 0.630930 |
| 17 | 367.500000 | 0.420667 | 0.640380 |
| 18 | 367.617647 | 0.391683 | 0.598214 |
| 19 | 367.735294 | 0.404903 | 0.620432 |
| 20 | 367.852941 | 0.409563 | 0.629626 |
| 21 | 367.970588 | 0.389488 | 0.600722 |
| 22 | 368.000000 | 0.396789 | 0.612483 |
| 23 | 368.088235 | 0.398162 | 0.616106 |
| 24 | 368.205882 | 0.362340 | 0.562505 |
| 25 | 368.323529 | 0.386958 | 0.602680 |
| 26 | 368.441176 | 0.363643 | 0.568210 |
| 27 | 368.558824 | 0.368118 | 0.577072 |
| 28 | 368.676471 | 0.384098 | 0.604078 |
| 29 | 368.794118 | 0.353605 | 0.557925 |
| 30 | 368.911765 | 0.346474 | 0.548445 |
| 31 | 369.029412 | 0.350741 | 0.556996 |
| 32 | 369.147059 | 0.362347 | 0.577286 |
| 33 | 369.264706 | 0.362578 | 0.579519 |
| 34 | 369.382353 | 0.340765 | 0.546411 |
| 35 | 369.500000 | 0.337462 | 0.542857 |
| 36 | 369.617647 | 0.355729 | 0.574083 |
| 37 | 369.735294 | 0.348679 | 0.564513 |
| 38 | 369.852941 | 0.338187 | 0.549284 |
| 39 | 369.970588 | 0.324360 | 0.528514 |
| 40 | 370.088235 | 0.310753 | 0.507964 |
| 41 | 370.205882 | 0.311037 | 0.510055 |
| 42 | 370.323529 | 0.311263 | 0.512055 |
| 43 | 370.441176 | 0.308081 | 0.508437 |
| 44 | 370.558824 | 0.308224 | 0.510293 |
| 45 | 370.676471 | 0.318148 | 0.528399 |
| 46 | 370.794118 | 0.308334 | 0.513728 |
| 47 | 370.911765 | 0.317937 | 0.531410 |
| 48 | 371.029412 | 0.289149 | 0.484824 |
| 49 | 371.147059 | 0.298637 | 0.502318 |
We define the exp_list by splitting the data into individual experiments, or data points.
# Build one experiment object per row of the dataset
exp_list = []
for i in range(data.shape[0]):
    exp_list.append(NRTLExperiment(data.iloc[i]))
We are now ready to set up the parameter estimation problem. We create a parameter estimation object called pest by passing the experiment list and the name of the objective function to the Estimator constructor. Setting tee=True prints the solver output while the parameter estimation problem is solved.
import logging
idaeslog.getIdaesLogger("core.property_meta").setLevel(logging.ERROR)
pest = parmest.Estimator(exp_list, obj_function="SSE", tee=True)
obj_value, parameters = pest.theta_est()
Ipopt 3.13.2:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit http://projects.coin-or.org/Ipopt
This version of Ipopt was compiled from source code available at
https://github.com/IDAES/Ipopt as part of the Institute for the Design of
Advanced Energy Systems Process Systems Engineering Framework (IDAES PSE
Framework) Copyright (c) 2018-2019. See https://github.com/IDAES/idaes-pse.
This version of Ipopt was compiled using HSL, a collection of Fortran codes
for large-scale scientific computation. All technical papers, sales and
publicity material resulting from use of the HSL codes within IPOPT must
contain the following acknowledgement:
HSL, a collection of Fortran codes for large-scale scientific
computation. See http://www.hsl.rl.ac.uk.
******************************************************************************
This is Ipopt version 3.13.2, running with linear solver ma27.
Number of nonzeros in equality constraint Jacobian...: 3750
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 2200
Total number of variables............................: 1102
variables with only lower bounds: 0
variables with lower and upper bounds: 300
variables with only upper bounds: 0
Total number of equality constraints.................: 1100
Total number of inequality constraints...............: 0
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 0
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 6.0671019e-03 3.15e+00 1.97e-05 -1.0 0.00e+00 - 0.00e+00 0.00e+00 0
1 8.6249856e-04 1.40e+03 2.15e-01 -1.0 1.37e+04 - 9.95e-01 1.00e+00h 1
2 1.1627234e-03 9.11e+03 8.21e-01 -1.7 4.74e+02 - 8.66e-01 1.00e+00h 1
3 1.0978149e-03 9.02e+03 7.99e-01 -1.7 5.89e+00 -4.0 5.44e-01 2.64e-02h 6
4 8.5702670e-04 8.63e+02 3.17e-02 -1.7 6.93e-01 -2.7 1.00e+00 1.00e+00h 1
5 1.3332724e-03 3.57e+03 7.92e-03 -1.7 7.75e-01 - 1.00e+00 1.00e+00h 1
6 1.5692588e-03 1.64e+02 2.34e-04 -1.7 1.98e-01 - 1.00e+00 1.00e+00h 1
7 1.5828905e-03 1.20e+01 1.13e-05 -1.7 4.29e-02 - 1.00e+00 1.00e+00h 1
8 1.4366151e-03 9.18e-01 2.34e-04 -2.5 5.49e-02 - 1.00e+00 1.00e+00h 1
9 8.9194043e-04 2.19e+01 2.00e-04 -3.8 2.51e-01 - 1.00e+00 1.00e+00h 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
10 5.9835646e-04 3.72e+01 2.34e-05 -3.8 3.13e-01 - 1.00e+00 1.00e+00h 1
11 5.9839648e-04 1.26e+00 4.06e-08 -3.8 2.87e-02 - 1.00e+00 1.00e+00h 1
12 5.9838953e-04 4.33e-05 1.60e-12 -3.8 9.40e-05 - 1.00e+00 1.00e+00h 1
13 5.9077073e-04 1.21e+01 2.96e-06 -5.7 4.79e-02 - 1.00e+00 1.00e+00h 1
14 5.9066394e-04 1.19e-03 4.25e-07 -5.7 5.98e-04 -3.1 1.00e+00 1.00e+00h 1
15 5.9059984e-04 1.08e-02 2.96e-07 -8.6 1.25e-03 -3.6 1.00e+00 1.00e+00h 1
16 5.9042438e-04 9.67e-02 2.92e-07 -8.6 3.70e-03 -4.1 1.00e+00 1.00e+00h 1
17 5.8992758e-04 8.72e-01 2.91e-07 -8.6 1.11e-02 -4.6 1.00e+00 1.00e+00h 1
18 5.8840706e-04 8.23e+00 3.09e-07 -8.6 3.40e-02 -5.1 1.00e+00 1.00e+00h 1
19 5.8319289e-04 9.25e+01 1.97e-06 -8.6 1.15e-01 -5.5 1.00e+00 1.00e+00h 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
20 5.4836946e-04 3.88e+03 1.22e-04 -8.6 7.95e-01 -6.0 1.00e+00 1.00e+00h 1
21 5.0847689e-04 4.45e+02 1.04e-04 -8.6 3.55e-01 -5.6 1.00e+00 1.00e+00h 1
22 5.0727291e-04 4.20e+00 1.54e-05 -8.6 6.20e+02 - 1.00e+00 1.00e+00h 1
23 5.0749772e-04 1.36e+00 1.44e-06 -8.6 5.13e+01 - 1.00e+00 1.00e+00h 1
24 5.0749686e-04 6.93e-06 4.21e-11 -8.6 3.55e-01 - 1.00e+00 1.00e+00h 1
25 5.0749686e-04 3.36e-06 1.49e-12 -9.0 6.25e-02 - 1.00e+00 1.00e+00h 1
26 5.0749686e-04 1.02e-10 2.36e-18 -9.0 5.08e-06 - 1.00e+00 1.00e+00h 1
Number of Iterations....: 26
(scaled) (unscaled)
Objective...............: 5.0749685787934424e-04 5.0749685787934424e-04
Dual infeasibility......: 2.3583858113882067e-18 2.3583858113882067e-18
Constraint violation....: 3.5405706676501683e-13 1.0186340659856796e-10
Complementarity.........: 9.0909090909091344e-10 9.0909090909091344e-10
Overall NLP error.......: 9.0909090909091344e-10 9.0909090909091344e-10
Number of objective function evaluations = 33
Number of objective gradient evaluations = 27
Number of equality constraint evaluations = 34
Number of inequality constraint evaluations = 0
Number of equality constraint Jacobian evaluations = 27
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations = 26
Total CPU secs in IPOPT (w/o function evaluations) = 0.035
Total CPU secs in NLP function evaluations = 0.009
EXIT: Optimal Solution Found.
The solver log shows that the resulting parameter estimation problem has 1102 variables and 1100 equality constraints, which leaves two degrees of freedom: the two tau parameters being estimated. Let us display the results by running the next cell.
print("The SSE at the optimal solution is %0.6f" % (obj_value))
print()
print("The values for the parameters are as follows:")
for k, v in parameters.items():
print(k, "=", v)
The SSE at the optimal solution is 0.000507
The values for the parameters are as follows:
fs.properties.tau[benzene,toluene] = -0.8987550041842163
fs.properties.tau[toluene,benzene] = 1.4104702103547941
Using the data that was provided, we have estimated the binary interaction parameters in the NRTL model for a benzene-toluene mixture. Although the dataset that was provided was temperature dependent, in this example we have estimated a single value that fits best for all temperatures.
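To put the reported SSE in perspective, the sketch below converts it to a root-mean-square residual per measured mole fraction. It assumes the objective is an unweighted sum over all 50 data points and both outputs, which is an interpretation of the "SSE" option rather than something confirmed in this module.

```python
import math

sse_value = 5.0749685787934424e-04  # objective value reported by Ipopt above
n_points = 50                       # data points in the dataset
n_outputs = 2                       # liquid and vapor benzene mole fractions

# Root-mean-square residual per measured mole fraction (assumes unweighted SSE).
rms_residual = math.sqrt(sse_value / (n_points * n_outputs))
print(f"{rms_residual:.4f}")
```

Under that assumption, the typical mismatch between model and data is on the order of a few thousandths in mole fraction.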
Advanced options for parmest: bootstrapping
Pyomo’s parmest tool allows for bootstrapping where the parameter estimation is repeated over n samples with resampling from the original data set. Parameter estimation with bootstrap resampling can be used to identify confidence regions around each parameter estimate. This analysis can be slow given the increased number of model instances that need to be solved. Please refer to https://pyomo.readthedocs.io/en/stable/contributed_packages/parmest/driver.html for more details.
For the example above, the bootstrapping can be run by uncommenting the code in the following cell:
# Uncomment the following lines
# bootstrap_theta = pest.theta_est_bootstrap(4, seed=542)
# display(bootstrap_theta)
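The resampling idea behind bootstrapping can be illustrated without parmest. In this minimal sketch, the sample mean stands in for the parameter estimation step, and the function `bootstrap_means` is hypothetical; the way parmest resamples experiments internally may differ.

```python
import random
import statistics


def bootstrap_means(samples, n_boot, seed):
    """Re-estimate the mean on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(samples, k=len(samples))
        estimates.append(statistics.mean(resample))
    return estimates


# Liquid-phase benzene mole fractions from the first five rows of the dataset.
liq_benzene = [0.480953, 0.462444, 0.477984, 0.440547, 0.427421]

estimates = bootstrap_means(liq_benzene, n_boot=200, seed=542)
estimates.sort()
# Rough 90% interval from the 5th and 95th percentiles of the bootstrap estimates.
lower, upper = estimates[9], estimates[189]
print(lower, upper)
```

The spread of the re-estimated values is what gives the confidence region; with parmest, each resample requires solving a full estimation problem, which is why the analysis can be slow.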