matchzoo.engine package¶
Submodules¶
matchzoo.engine.base_model module¶
Base Model.
-
class
matchzoo.engine.base_model.
BaseModel
(params=None, backend=None)¶ Bases:
abc.ABC
Abstract base class of all matchzoo models.
-
BACKEND_FILENAME
= 'backend.h5'¶
-
PARAMS_FILENAME
= 'params.dill'¶
-
backend
¶ Return the model backend, a keras model instance.
Return type: Model
-
build
()¶ Build the model. Each subclass needs to implement this method.
Example
>>> BaseModel()
Traceback (most recent call last):
...
TypeError: Can't instantiate abstract class BaseModel ...
>>> class MyModel(BaseModel):
...     def build(self):
...         pass
>>> MyModel
<class 'matchzoo.engine.base_model.MyModel'>
-
compile
()¶ Compile model for training.
-
evaluate
(x, y, batch_size=128, verbose=1)¶ Evaluate the model.
See
keras.models.Model.evaluate()
for more details.
Parameters:
- x (Union[ndarray, List[ndarray]]) -- input data
- y (ndarray) -- labels
- batch_size (int) -- number of samples per gradient update
- verbose (int) -- verbosity mode, 0 or 1

Return type: Union[float, List[float]]
Returns: scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.backend.metrics_names will give you the display labels for the scalar outputs.
-
fit
(x, y, batch_size=128, epochs=1, verbose=1)¶ Fit the model.
See
keras.models.Model.fit()
for more details.
Parameters:
- x (Union[ndarray, List[ndarray]]) -- input data.
- y (ndarray) -- labels.
- batch_size (int) -- number of samples per gradient update.
- epochs (int) -- number of epochs to train the model.
- verbose (int) -- 0, 1, or 2. Verbosity mode: 0 = silent, 1 = verbose, 2 = one log line per epoch.

Return type: History
Returns: A keras.callbacks.History instance. Its history attribute contains all information collected during training.
-
classmethod
get_default_params
()¶ Model default parameters.
The common usage is to instantiate matchzoo.engine.ModelParams first, then set the model-specific parameters.
Examples
>>> class MyModel(BaseModel):
...     def build(self):
...         print(self._params['num_eggs'], 'eggs')
...         print('and', self._params['ham_type'])
...     @classmethod
...     def get_default_params(cls):
...         params = engine.ParamTable()
...         params.add(engine.Param('num_eggs', 512))
...         params.add(engine.Param('ham_type', 'Parma Ham'))
...         return params
>>> my_model = MyModel()
>>> my_model.build()
512 eggs
and Parma Ham
Notice that all parameters must be serialisable for the entire model to be serialisable. Therefore, it's strongly recommended to use python native data types to store parameters.
Return type: ParamTable
Returns: model parameters
-
guess_and_fill_missing_params
()¶ Guess and fill missing parameters in
params
. Note: likely to be moved to a higher-level API in the future.
-
params
¶ Returns: model parameters.
Return type: ParamTable
-
predict
(x, batch_size=128)¶ Generate output predictions for the input samples.
See
keras.models.Model.predict()
for more details.
Parameters:
- x (Union[ndarray, List[ndarray]]) -- input data
- batch_size (int) -- number of samples per batch

Return type: ndarray
Returns: numpy array(s) of predictions
-
save
(dirpath)¶ Save the model.
A saved model is represented as a directory with two files: a model-parameters file (params.dill) serialized with dill, and a backend h5 file (backend.h5) saved by keras.
Parameters: dirpath (Union[str, Path]) -- directory path of the saved model
-
-
matchzoo.engine.base_model.
load_model
(dirpath)¶ Load a model. The reverse function of
BaseModel.save()
.
Parameters: dirpath (Union[str, Path]) -- directory path of the saved model
Return type: BaseModel
Returns: a BaseModel instance
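The two-file layout of save() and load_model() can be sketched in plain Python. This is a simplified illustration only: pickle stands in for dill, raw bytes stand in for the keras h5 backend, and save_sketch/load_sketch are hypothetical helpers, not part of matchzoo.

```python
import pickle
import tempfile
from pathlib import Path

PARAMS_FILENAME = 'params.dill'
BACKEND_FILENAME = 'backend.h5'

def save_sketch(dirpath, params, backend_bytes):
    # One directory, two files: serialized parameters plus backend weights.
    dirpath = Path(dirpath)
    dirpath.mkdir(parents=True, exist_ok=True)
    with open(dirpath / PARAMS_FILENAME, 'wb') as f:
        pickle.dump(params, f)
    (dirpath / BACKEND_FILENAME).write_bytes(backend_bytes)

def load_sketch(dirpath):
    # The reverse of save_sketch: read both files back.
    dirpath = Path(dirpath)
    with open(dirpath / PARAMS_FILENAME, 'rb') as f:
        params = pickle.load(f)
    return params, (dirpath / BACKEND_FILENAME).read_bytes()

with tempfile.TemporaryDirectory() as tmp:
    save_sketch(tmp, {'num_eggs': 512}, b'weights')
    params, backend = load_sketch(tmp)
    print(params)  # {'num_eggs': 512}
```

Keeping parameters and weights in separate files lets the parameters be inspected without loading the backend.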
matchzoo.engine.base_preprocessor module¶
Base Preprocessor, consisting of multiple ProcessorUnit instances.
Each sub-class should employ a sequence of ProcessorUnit and StatefulProcessorUnit to handle input data.
-
class
matchzoo.engine.base_preprocessor.
BasePreprocessor
¶ Bases:
object
Abstract base class for model-wise processors.
-
fit_transform
(text_left, text_right, labels)¶ Apply fit-transform on input data.
This is an abstract method and needs to be implemented in a sub-class.
-
handle
(process_unit, input)¶ Infer whether a process_unit is stateful.
Parameters:
- process_unit (ProcessorUnit) -- a process unit instance.
- input (Any) -- input text to process.

Return ctx: context as a dict, i.e. fitted parameters.
Return type: Union[dict, Any]
Returns: transformed user input given the transformer.
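The stateful-versus-stateless dispatch can be illustrated with a standalone sketch. Stateless, Stateful, and handle here are hypothetical stand-ins for illustration, not matchzoo classes:

```python
class Stateless:
    """A unit that only transforms its input."""
    def transform(self, text):
        return text.lower()

class Stateful(Stateless):
    """A unit that must be fit first; fitting returns a context dict."""
    def fit(self, text):
        return {'vocab': sorted(set(text.split()))}

def handle(process_unit, input_text):
    # Stateful units are fit to produce a context of fitted parameters;
    # stateless units skip straight to transforming.
    ctx = {}
    if isinstance(process_unit, Stateful):
        ctx = process_unit.fit(input_text)
    return ctx, process_unit.transform(input_text)

ctx, out = handle(Stateful(), 'Hello World')
print(out)  # hello world
```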
-
matchzoo.engine.base_task module¶
Base task.
-
class
matchzoo.engine.base_task.
BaseTask
¶ Bases:
abc.ABC
Base Task, shouldn't be used directly.
-
classmethod
list_available_losses
()¶ Return type: list
Returns: a list of available losses.
-
classmethod
list_available_metrics
()¶ Return type: list
Returns: a list of available metrics.
-
output_shape
¶ Returns: output shape of a single sample of the task.
Return type: tuple
matchzoo.engine.hyper_spaces module¶
Hyper parameter search spaces wrapping hyperopt.
-
class
matchzoo.engine.hyper_spaces.
HyperoptProxy
(hyperopt_func, **kwargs)¶ Bases:
object
Hyperopt proxy class.
See hyperopt's documentation for more details: https://github.com/hyperopt/hyperopt/wiki/FMin
The reason for these wrappers:
A hyper space in hyperopt requires a label to instantiate. This label is later used as a reference to the original hyper space that was sampled. In matchzoo, hyper spaces are used in matchzoo.engine.Param. Only if a hyper space's label matches its parent matchzoo.engine.Param's name can matchzoo correctly back-reference the sampled parameter. This could be done by asking the user to always use the same name for a parameter and its hyper space, but typos can occur. As a result, these wrappers were created to hide the hyper spaces' labels and always bind them correctly to their parameter's name.
Example
>>> from hyperopt.pyll.stochastic import sample
>>> numbers = [0, 1, 2]
>>> sample(choice(options=numbers)('numbers')) in numbers
True
>>> 0 <= sample(quniform(low=0, high=9)('digit')) <= 9
True
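The label-binding trick behind these proxies can be sketched without hyperopt at all. ProxySketch and fake_hyper_space below are made-up stand-ins (fake_hyper_space imitates a hyperopt.hp function that requires a label):

```python
class ProxySketch:
    """Store everything except the label; the label is supplied later,
    by the owning parameter, using its own name."""
    def __init__(self, func, **kwargs):
        self._func = func
        self._kwargs = kwargs

    def __call__(self, label):
        # Bind the deferred label at the last moment.
        return self._func(label, **self._kwargs)

def fake_hyper_space(label, low, high):
    # Stand-in for a hyperopt.hp function that needs a label.
    return (label, low, high)

space = ProxySketch(fake_hyper_space, low=1, high=5)
print(space('num_units'))  # ('num_units', 1, 5)
```

Because the label is supplied by the caller at the end, a Param can pass in its own name and the two can never fall out of sync.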
-
class
matchzoo.engine.hyper_spaces.
choice
(options)¶ Bases:
matchzoo.engine.hyper_spaces.HyperoptProxy
hyperopt.hp.choice()
proxy.
-
class
matchzoo.engine.hyper_spaces.
quniform
(low, high, q=1)¶ Bases:
matchzoo.engine.hyper_spaces.HyperoptProxy
hyperopt.hp.quniform()
proxy.
-
class
matchzoo.engine.hyper_spaces.
uniform
(low, high)¶ Bases:
matchzoo.engine.hyper_spaces.HyperoptProxy
hyperopt.hp.uniform()
proxy.
matchzoo.engine.param module¶
Parameter class.
-
class
matchzoo.engine.param.
Param
(name, value=None, hyper_space=None, validator=None)¶ Bases:
object
Parameter class.
Basic usages with a name and value:
>>> param = Param('my_param', 10)
>>> param.name
'my_param'
>>> param.value
10
Use with a validator to make sure the parameter always keeps a valid value.
>>> param = Param(
...     name='my_param',
...     value=5,
...     validator=lambda x: 0 < x < 20
... )
>>> param.validator
<function <lambda> at 0x...>
>>> param.value
5
>>> param.value = 10
>>> param.value
10
>>> param.value = -1
Traceback (most recent call last):
...
ValueError: Validator not satifised.
The validator's definition is as follows:
validator=lambda x: 0 < x < 20
Use with a hyper space. Setting up a hyper space for a parameter makes the parameter tunable in a
matchzoo.engine.Tuner
.
>>> from matchzoo.engine.hyper_spaces import quniform
>>> param = Param(
...     name='positive_num',
...     value=1,
...     hyper_space=quniform(low=1, high=5)
... )
>>> param.hyper_space
<hyperopt.pyll.base.Apply object at 0x...>
>>> from hyperopt.pyll.stochastic import sample
>>> samples = [sample(param.hyper_space) for _ in range(64)]
>>> set(samples) == {1, 2, 3, 4, 5}
True
The boolean value of a
Param
instance is True only when the value is not None. This is because some default falsy values, like zero or an empty list, are valid parameter values. In other words, the boolean value means "whether the parameter value is filled".
>>> param = Param('dropout')
>>> if param:
...     print('OK')
>>> param = Param('dropout', 0)
>>> if param:
...     print('OK')
OK
A _pre_assignment_hook is initialized as a data type converter if the value is set as a number, to keep the parameter's data type consistent. This conversion supports python built-in numbers, numpy numbers, and any number that inherits
numbers.Number
.
>>> param = Param('float_param', 0.5)
>>> param.value = 10
>>> param.value
10.0
>>> type(param.value)
<class 'float'>
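The conversion behaviour can be sketched standalone. TypedValue below is a hypothetical illustration of the hook's idea, not matchzoo's Param:

```python
import numbers

class TypedValue:
    """Coerce newly assigned numbers to the type of the initial value."""
    def __init__(self, value):
        # Remember the numeric type only if the initial value is a number.
        self._numeric_type = type(value) if isinstance(value, numbers.Number) else None
        self._value = value

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        # Convert any numbers.Number to the recorded type, e.g. int -> float.
        if self._numeric_type is not None and isinstance(new, numbers.Number):
            new = self._numeric_type(new)
        self._value = new

v = TypedValue(0.5)
v.value = 10
print(v.value)  # 10.0
```

Checking against numbers.Number rather than int or float is what makes the hook work for numpy scalars too, since they register as subclasses of numbers.Number.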
-
hyper_space
¶ Returns: hyper space of the parameter.
Return type: Apply
-
name
¶ Returns: name of the parameter.
Return type: str
-
validator
¶ Returns: validator of the parameter.
Return type: Callable[[Any], bool]
-
value
¶ Returns: value of the parameter.
Return type: Any
-
matchzoo.engine.param_table module¶
Parameters table class.
-
class
matchzoo.engine.param_table.
ParamTable
¶ 基类:
object
Parameter table class.
Example
>>> params = ParamTable()
>>> params.add(Param('ham', 'Parma Ham'))
>>> params.add(Param('egg', 'Over Easy'))
>>> params['ham']
'Parma Ham'
>>> params['egg']
'Over Easy'
>>> print(params)
ham Parma Ham
egg Over Easy
>>> params.add(Param('egg', 'Sunny side Up'))
Traceback (most recent call last):
...
ValueError: Parameter named egg already exists.
To re-assign parameter egg value, use `params["egg"] = value` instead.
-
completed
()¶ Return type: bool
Returns: True if all params are filled, False otherwise.
Example
>>> import matchzoo
>>> model = matchzoo.models.NaiveModel()
>>> model.params.completed()
False
>>> model.guess_and_fill_missing_params()
>>> model.params.completed()
True
-
hyper_space
¶ Returns: hyper space of the table, a valid hyperopt graph.
Return type: dict
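How the table-level space might be assembled can be sketched like this. The dict of lambdas stands in for labelled hyper-space factories; the names and shapes are illustrative assumptions, not matchzoo internals:

```python
# Each parameter contributes a hyper-space factory that takes a label;
# the table builds each space with the parameter's own name as the label
# and collects them into a dict keyed by name.
param_spaces = {
    'num_units': lambda label: ('quniform', label, 1, 5),
    'dropout': lambda label: ('uniform', label, 0.0, 0.5),
}

def table_hyper_space(param_spaces):
    # Dict of name -> labelled space: the shape a hyperopt search expects.
    return {name: build(name) for name, build in param_spaces.items()}

space = table_hyper_space(param_spaces)
print(space['num_units'])  # ('quniform', 'num_units', 1, 5)
```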
-
matchzoo.engine.tune module¶
Tuner class. Currently a minimum working demo.
-
matchzoo.engine.tune.
tune
(model, max_evals=32)¶ Tune the model max_evals times.
Construct a hyper-parameter search space by extracting all parameters in the model that have a defined hyper space. Then, using the hyperopt API, iteratively sample parameters, test for loss, and pick the best trial of all. Currently a minimum working demo.
Parameters:
- model (BaseModel) -- the model to be tuned.
- max_evals (int) -- number of evaluations of a single tuning process.

Return type: list
Returns: a list of trials of the tuning process.
Example
>>> from matchzoo.models import DenseBaselineModel
>>> model = DenseBaselineModel()
>>> max_evals = 4
>>> trials = tune(model, max_evals)
>>> len(trials) == max_evals
True
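The sample-evaluate-record loop can be sketched with plain random search standing in for hyperopt's sampler. tune_sketch and its discrete space format are illustrative assumptions, not matchzoo's API:

```python
import random

def tune_sketch(loss_fn, space, max_evals=32, seed=0):
    # Sample a value for every tunable parameter, score it, and keep
    # every trial so the best one can be picked afterwards.
    rng = random.Random(seed)
    trials = []
    for _ in range(max_evals):
        sample = {name: rng.choice(options) for name, options in space.items()}
        trials.append({'params': sample, 'loss': loss_fn(sample)})
    return trials

space = {'num_units': [64, 128, 256], 'dropout': [0.0, 0.25, 0.5]}
trials = tune_sketch(lambda p: p['dropout'] + 1.0 / p['num_units'], space, max_evals=4)
print(len(trials))  # 4
best = min(trials, key=lambda t: t['loss'])
```

Keeping every trial, rather than just the best one, mirrors the return value documented above: the caller gets the full list and can inspect the whole search afterwards.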