matchzoo.engine package¶

Submodules¶

matchzoo.engine.base_model module¶

Base Model.

class matchzoo.engine.base_model.BaseModel(params=None, backend=None)¶

Bases: abc.ABC

Abstract base class of all matchzoo models.

BACKEND_FILENAME = 'backend.h5'¶

PARAMS_FILENAME = 'params.dill'¶

backend¶

Return the model backend, a Keras model instance.

Return type:	`Model`

build()¶

Build the model. Each subclass needs to implement this method.

Example

>>> BaseModel()  
Traceback (most recent call last):
...
TypeError: Can't instantiate abstract class BaseModel ...
>>> class MyModel(BaseModel):
...     def build(self):
...         pass
>>> MyModel
<class 'matchzoo.engine.base_model.MyModel'>

compile()¶

Compile model for training.

evaluate(x, y, batch_size=128, verbose=1)¶

Evaluate the model.

See keras.models.Model.evaluate() for more details.

Parameters:	x (`Union`[`ndarray`, `List`[`ndarray`]]) -- input data
	y (`ndarray`) -- labels
	batch_size (`int`) -- number of samples per evaluation batch
	verbose (`int`) -- verbosity mode, 0 or 1
Return type:	`Union`[`float`, `List`[`float`]]
Returns:	scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.backend.metrics_names gives the display labels for the scalar outputs.
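
Because the return value is either a bare scalar or a list, it can be convenient to normalise it into a name-to-value mapping. The helper below is a hypothetical convenience, not part of matchzoo. It assumes only what the description above states: that the names in metrics_names are in the same order as the returned scalars. The values passed in are stand-ins rather than real evaluation results.

```python
from typing import Dict, List, Union


def label_scores(metrics_names: List[str],
                 result: Union[float, List[float]]) -> Dict[str, float]:
    """Pair evaluate() output with its display labels (hypothetical helper)."""
    # A model with one output and no metrics yields a bare scalar loss.
    values = result if isinstance(result, (list, tuple)) else [result]
    if len(values) != len(metrics_names):
        raise ValueError('expected %d values, got %d'
                         % (len(metrics_names), len(values)))
    return dict(zip(metrics_names, values))


# Stand-in values; a real call would pass model.backend.metrics_names
# and the result of model.evaluate(x, y).
single = label_scores(['loss'], 0.25)
several = label_scores(['loss', 'mae'], [0.25, 0.4])
print(single)
print(several)
```

The length check is a design choice of this sketch: a mismatch between names and values usually means the model was recompiled with different metrics, and failing loudly is safer than silently truncating.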

fit(x, y, batch_size=128, epochs=1, verbose=1)¶

Fit the model.

See keras.models.Model.fit() for more details.

Parameters:	x (`Union`[`ndarray`, `List`[`ndarray`]]) -- input data
	y (`ndarray`) -- labels
	batch_size (`int`) -- number of samples per gradient update
	epochs (`int`) -- number of epochs to train the model
	verbose (`int`) -- verbosity mode: 0 = silent, 1 = verbose, 2 = one log line per epoch
Return type:	`History`
Returns:	A keras.callbacks.History instance. Its history attribute contains all information collected during training.

classmethod get_default_params()¶

Model default parameters.

The common usage is to instantiate matchzoo.engine.ModelParams first, then set the model-specific parameters.

Examples

>>> class MyModel(BaseModel):
...     def build(self):
...         print(self._params['num_eggs'], 'eggs')
...         print('and', self._params['ham_type'])
...
...     @classmethod
...     def get_default_params(cls):
...         params = engine.ParamTable()
...         params.add(engine.Param('num_eggs', 512))
...         params.add(engine.Param('ham_type', 'Parma Ham'))
...         return params
>>> my_model = MyModel()
>>> my_model.build()
512 eggs
and Parma Ham

Note that all parameters must be serialisable for the entire model to be serialisable. It is therefore strongly recommended to store parameters in native Python data types.

Return type:	`ParamTable`
Returns:	model parameters

guess_and_fill_missing_params()¶

Guess and fill missing parameters in params.

Note: likely to be moved to a higher level API in the future.

params¶

Returns:	model parameters.

Return type:	`ParamTable`

predict(x, batch_size=128)¶

Generate output predictions for the input samples.

See keras.models.Model.predict() for more details.

Parameters:	x (`Union`[`ndarray`, `List`[`ndarray`]]) -- input data
	batch_size -- number of samples per prediction batch
Return type:	`ndarray`
Returns:	numpy array(s) of predictions

save(dirpath)¶

Save the model.

A saved model is represented as a directory with two files: a pickled model parameters file (PARAMS_FILENAME) and a model h5 file saved by Keras (BACKEND_FILENAME).

Parameters:	dirpath (`Union`[`str`, `Path`]) -- directory path of the saved model
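
The two-file directory layout can be sketched with the standard library alone. This is an illustration under stated assumptions, not matchzoo's implementation: save_sketch and load_params_sketch are hypothetical names, the parameters are a plain dict serialised with the stdlib pickle module, and the backend file is written as placeholder bytes because the real file is produced by Keras. Only the two file names come from the class constants above.

```python
import pickle
import tempfile
from pathlib import Path

PARAMS_FILENAME = 'params.dill'   # same names as the class constants
BACKEND_FILENAME = 'backend.h5'


def save_sketch(dirpath, params: dict, backend_bytes: bytes) -> Path:
    """Write the two-file layout into a directory (illustrative only)."""
    dirpath = Path(dirpath)
    dirpath.mkdir(parents=True, exist_ok=True)
    with open(dirpath / PARAMS_FILENAME, 'wb') as f:
        pickle.dump(params, f)
    # Placeholder: in the real library this file is written by Keras.
    (dirpath / BACKEND_FILENAME).write_bytes(backend_bytes)
    return dirpath


def load_params_sketch(dirpath) -> dict:
    """Read back only the parameters file."""
    with open(Path(dirpath) / PARAMS_FILENAME, 'rb') as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    saved = save_sketch(Path(tmp) / 'my_model', {'num_eggs': 512}, b'')
    saved_names = sorted(p.name for p in saved.iterdir())
    restored = load_params_sketch(saved)

print(saved_names)
print(restored)
```

Keeping parameters and weights in separate files means the parameters can be inspected without loading the backend at all.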

matchzoo.engine.base_model.load_model(dirpath)¶

Load a model. This is the inverse of BaseModel.save().

Parameters:	dirpath (`Union`[`str`, `Path`]) -- directory path of the saved model
Return type:	`BaseModel`
Returns:	a `BaseModel` instance

matchzoo.engine.base_preprocessor module¶

Base Preprocessor, consisting of multiple ProcessorUnit instances.

Each subclass should employ a sequence of ProcessorUnit and StatefulProcessorUnit instances to handle input data.

class matchzoo.engine.base_preprocessor.BasePreprocessor¶

Bases: object

Abstract base class for model-wise processors.

fit_transform(text_left, text_right, labels)¶

Apply fit-transform on input data.

This is an abstract method and must be implemented in subclasses.

handle(process_unit, input)¶

Infer whether a process_unit is stateful.

Parameters:	process_unit (`ProcessorUnit`) -- a process unit instance
	input (`Any`) -- input text to process
Return ctx:	Context as dict, i.e. fitted parameters.
Return type:	`Union`[`dict`, `Any`]
Returns:	User input transformed by the given process unit.
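
One way to read this description is that a stateful unit is fitted before it transforms, and its fitted state is returned as the context. The sketch below follows that reading. Everything in it is an assumption made for illustration: the fit/transform/state interface, the Lowercase and Vocabulary units, and the (ctx, transformed) return order are hypothetical and may differ from the library.

```python
from typing import Any, Tuple


class ProcessorUnit:
    """Hypothetical stateless unit: transform only."""

    def transform(self, tokens):
        raise NotImplementedError


class StatefulProcessorUnit(ProcessorUnit):
    """Hypothetical stateful unit: must be fitted before transforming."""

    def __init__(self):
        self.state = {}

    def fit(self, tokens):
        raise NotImplementedError


class Lowercase(ProcessorUnit):
    def transform(self, tokens):
        return [t.lower() for t in tokens]


class Vocabulary(StatefulProcessorUnit):
    def fit(self, tokens):
        # Assign each distinct token an index in first-seen order.
        for token in tokens:
            self.state.setdefault(token, len(self.state))

    def transform(self, tokens):
        return [self.state[t] for t in tokens]


def handle(process_unit: ProcessorUnit, data: Any) -> Tuple[dict, Any]:
    """Fit the unit first if it is stateful, then transform the input."""
    ctx = {}
    if isinstance(process_unit, StatefulProcessorUnit):
        process_unit.fit(data)
        ctx = process_unit.state
    return ctx, process_unit.transform(data)


lower_ctx, tokens = handle(Lowercase(), ['Hello', 'World', 'hello'])
vocab_ctx, ids = handle(Vocabulary(), tokens)
print(lower_ctx, tokens)
print(vocab_ctx, ids)
```

Dispatching on the unit's type keeps the calling code uniform: a preprocessor can loop over a mixed list of units without knowing which ones carry state.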

matchzoo.engine.base_task module¶

Base task.

class matchzoo.engine.base_task.BaseTask¶

Bases: abc.ABC

Base Task, shouldn't be used directly.

classmethod list_available_losses()¶

Return type:	`list`
Returns:	a list of available losses.

classmethod list_available_metrics()¶

Return type:	`list`
Returns:	a list of available metrics.

output_shape¶

Returns:	output shape of a single sample of the task.

Return type:	`tuple`

matchzoo.engine.base_task.list_available_tasks(base=<class 'matchzoo.engine.base_task.BaseTask'>)¶

Return type:	`List`[`Type`[`BaseTask`]]
Returns:	a list of available task types.
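
The `base` argument suggests that task types are discovered by walking the subclass tree. The following is a minimal sketch of that idea, assuming a recursive walk over `__subclasses__()`. The task classes are hypothetical stand-ins, and the library's own implementation may differ.

```python
import abc
from typing import List, Type


class BaseTask(abc.ABC):
    """Stand-in for matchzoo.engine.base_task.BaseTask."""


class Ranking(BaseTask):
    """Hypothetical task type."""


class Classification(BaseTask):
    """Hypothetical task type."""


class BinaryClassification(Classification):
    """Hypothetical task type nested one level deeper."""


def list_available_tasks(
        base: Type[BaseTask] = BaseTask) -> List[Type[BaseTask]]:
    """Collect every direct and indirect subclass of `base`, depth first."""
    found = []
    for subclass in base.__subclasses__():
        found.append(subclass)
        found.extend(list_available_tasks(subclass))
    return found


all_tasks = list_available_tasks()
nested = list_available_tasks(Classification)
print([task.__name__ for task in all_tasks])
print([task.__name__ for task in nested])
```

With this approach a new task becomes discoverable as soon as its class is defined, without a separate registry to keep in sync.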

matchzoo.engine.hyper_spaces module¶

Hyper parameter search spaces wrapping hyperopt.

class matchzoo.engine.hyper_spaces.HyperoptProxy(hyperopt_func, **kwargs)¶

Bases: object

Hyperopt proxy class.

See hyperopt's documentation for more details: https://github.com/hyperopt/hyperopt/wiki/FMin

Reason for these wrappers:

A hyper space in hyperopt requires a label to instantiate. This label is later used as a reference to the original hyper space that was sampled. In matchzoo, hyper spaces are used in matchzoo.engine.Param. Only if a hyper space's label matches the name of its parent matchzoo.engine.Param can matchzoo correctly trace a sampled value back to its parameter. This could be achieved by asking the user to always use the same name for a parameter and its hyper space, but typos can occur. These wrappers therefore hide the hyper spaces' labels and always bind each hyper space to its parameter's name.

Example

>>> from hyperopt.pyll.stochastic import sample
>>> numbers = [0, 1, 2]
>>> sample(choice(options=numbers)('numbers')) in numbers
True
>>> 0 <= sample(quniform(low=0, high=9)('digit')) <= 9
True
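
The label-hiding idea can be sketched without hyperopt installed. In the block below, fake_choice is a stand-in that mimics the `(label, options)` calling convention of hyperopt.hp.choice and merely records its arguments. LabelDeferringProxy and TunableParam are hypothetical names for illustration and are not the classes documented here.

```python
class LabelDeferringProxy:
    """Hold a space constructor and its arguments until a label is known."""

    def __init__(self, space_func, **kwargs):
        self._func = space_func
        self._kwargs = kwargs

    def __call__(self, label):
        # The label is supplied late, by whoever owns the parameter name.
        return self._func(label, **self._kwargs)


def fake_choice(label, options):
    """Stand-in for hyperopt.hp.choice; only records what it was given."""
    return {'label': label, 'kind': 'choice', 'options': options}


class TunableParam:
    """Hypothetical parameter that binds its own name to the proxy."""

    def __init__(self, name, hyper_space):
        self.name = name
        # The user never types the label, so it cannot drift from the name.
        self.hyper_space = hyper_space(name)


proxy = LabelDeferringProxy(fake_choice, options=[1, 2, 3])
param = TunableParam('num_layers', proxy)
print(param.hyper_space)
```

Because the proxy stores only the constructor and its keyword arguments, the same proxy object could be bound to several parameter names, each producing an independently labelled space.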

class matchzoo.engine.hyper_spaces.choice(options)¶

Bases: matchzoo.engine.hyper_spaces.HyperoptProxy

hyperopt.hp.choice() proxy.

class matchzoo.engine.hyper_spaces.quniform(low, high, q=1)¶

Bases: matchzoo.engine.hyper_spaces.HyperoptProxy

hyperopt.hp.quniform() proxy.

class matchzoo.engine.hyper_spaces.uniform(low, high)¶

Bases: matchzoo.engine.hyper_spaces.HyperoptProxy

hyperopt.hp.uniform() proxy.

matchzoo.engine.param module¶

Parameter class.

class matchzoo.engine.param.Param(name, value=None, hyper_space=None, validator=None)¶

Bases: object

Parameter class.

Basic usage with a name and value:

>>> param = Param('my_param', 10)
>>> param.name
'my_param'
>>> param.value
10

Use with a validator to make sure the parameter always keeps a valid value.

>>> param = Param(
...     name='my_param',
...     value=5,
...     validator=lambda x: 0 < x < 20
... )
>>> param.validator  
<function <lambda> at 0x...>
>>> param.value
5
>>> param.value = 10
>>> param.value
10
>>> param.value = -1
Traceback (most recent call last):
    ...
ValueError: Validator not satifised.
The validator's definition is as follows:
validator=lambda x: 0 < x < 20

Use with a hyper space. Setting up a hyper space for a parameter makes the parameter tunable in a matchzoo.engine.Tuner.

>>> from matchzoo.engine.hyper_spaces import quniform
>>> param = Param(
...     name='positive_num',
...     value=1,
...     hyper_space=quniform(low=1, high=5)
... )
>>> param.hyper_space  
<hyperopt.pyll.base.Apply object at 0x...>
>>> from hyperopt.pyll.stochastic import sample
>>> samples = [sample(param.hyper_space) for _ in range(64)]
>>> set(samples) == {1, 2, 3, 4, 5}
True

The boolean value of a Param instance is True only when its value is not None. This is because some falsy values, such as zero or an empty list, are valid parameter values. In other words, the boolean value answers the question "is the parameter value filled?".

>>> param = Param('dropout')
>>> if param:
...     print('OK')
>>> param = Param('dropout', 0)
>>> if param:
...     print('OK')
OK
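
The truthiness rule amounts to a one-line `__bool__` method. The class below is a hypothetical minimal stand-in, not the library's Param, and shows only that rule.

```python
class FilledFlag:
    """Hypothetical minimal stand-in for the truthiness rule of Param."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value

    def __bool__(self):
        # "Filled" means "not None", so 0, '' and [] still count as filled.
        return self.value is not None


candidates = [
    FilledFlag('dropout'),       # never assigned
    FilledFlag('dropout', 0),    # falsy but valid
    FilledFlag('layers', []),    # falsy but valid
]
filled = [bool(c) for c in candidates]
print(filled)
```

Without the explicit `is not None` comparison, a dropout rate of zero would be indistinguishable from a parameter that was never set.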

A _pre_assignment_hook is initialized as a data type converter when the value is set to a number, to keep the parameter's data type consistent. This conversion supports Python built-in numbers, numpy numbers, and any number that inherits from numbers.Number.

>>> param = Param('float_param', 0.5)
>>> param.value = 10
>>> param.value
10.0
>>> type(param.value)
<class 'float'>
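
A pre-assignment hook of this kind can be sketched as a property setter that remembers the type of the first numeric value. TypedValue and its `_hook` attribute are hypothetical names, and the real Param may choose its converter differently. The sketch shows only the idea of converting on assignment.

```python
import numbers


class TypedValue:
    """Hypothetical holder that keeps the numeric type of its first value."""

    def __init__(self, value=None):
        self._hook = None
        if isinstance(value, numbers.Number):
            # Remember the constructor of the initial value, e.g. float.
            self._hook = type(value)
        self._value = value

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        # Convert before storing so the type never drifts.
        if self._hook is not None:
            new_value = self._hook(new_value)
        self._value = new_value


ratio = TypedValue(0.5)
ratio.value = 10
label = TypedValue('relu')
label.value = 3  # no hook for non-numeric initial values
print(repr(ratio.value), type(ratio.value).__name__)
print(repr(label.value), type(label.value).__name__)
```

One caveat of this simple sketch: with an integer initial value, assigning 2.7 would be truncated to 2 by `int`, which may or may not be the desired behaviour.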

hyper_space¶

Returns:	Hyper space of the parameter.

Return type:	`Apply`

name¶

Returns:	Name of the parameter.

Return type:	`str`

validator¶

Returns:	Validator of the parameter.

Return type:	`Callable`[[`Any`], `bool`]

value¶

Returns:	Value of the parameter.

Return type:	`Any`

matchzoo.engine.param_table module¶

Parameters table class.

class matchzoo.engine.param_table.ParamTable¶

Bases: object

Parameter table class.

Example

>>> params = ParamTable()
>>> params.add(Param('ham', 'Parma Ham'))
>>> params.add(Param('egg', 'Over Easy'))
>>> params['ham']
'Parma Ham'
>>> params['egg']
'Over Easy'
>>> print(params)
ham                           Parma Ham
egg                           Over Easy
>>> params.add(Param('egg', 'Sunny side Up'))
Traceback (most recent call last):
    ...
ValueError: Parameter named egg already exists.
To re-assign parameter egg value, use `params["egg"] = value` instead.

add(param)¶

Parameters:	param (`Param`) -- parameter to add.

completed()¶

Return type:	`bool`
Returns:	True if all params are filled, False otherwise.

Example

>>> import matchzoo
>>> model = matchzoo.models.NaiveModel()
>>> model.params.completed()
False
>>> model.guess_and_fill_missing_params()
>>> model.params.completed()
True
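
The completeness check can be read as "no parameter value is None". The sketch below makes that assumption explicit. TinyParamTable and its missing() helper are hypothetical and use a plain dict in place of Param objects.

```python
class TinyParamTable:
    """Hypothetical table of name -> value; None means 'not filled'."""

    def __init__(self):
        self._values = {}

    def add(self, name, value=None):
        if name in self._values:
            raise ValueError('Parameter named %s already exists.' % name)
        self._values[name] = value

    def __setitem__(self, name, value):
        if name not in self._values:
            raise KeyError(name)
        self._values[name] = value

    def completed(self):
        # Compare against None explicitly so 0 and [] count as filled.
        return all(v is not None for v in self._values.values())

    def missing(self):
        return [n for n, v in self._values.items() if v is None]


table = TinyParamTable()
table.add('optimizer', 'adam')
table.add('dropout', 0)
table.add('input_shapes')
before = (table.completed(), table.missing())
table['input_shapes'] = [(30,), (30,)]
after = (table.completed(), table.missing())
print(before)
print(after)
```

A helper such as missing() is not described in this reference. It is included only because knowing which names are unfilled is often more useful than a bare boolean.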

hyper_space¶

Returns:	Hyper space of the table, a valid hyperopt graph.

Return type:	`dict`

matchzoo.engine.tune module¶

Tuner class. Currently a minimum working demo.

matchzoo.engine.tune.tune(model, max_evals=32)¶

Tune the model max_evals times.

Construct a hyperparameter search space by extracting all parameters in model that have a defined hyper space. Then, using the hyperopt API, iteratively sample parameters and test for loss, and pick the best trial of all. Currently a minimum working demo.

Parameters:	model (`BaseModel`) -- model to tune
	max_evals (`int`) -- number of evaluations in a single tuning process
Return type:	`list`
Returns:	A list of trials of the tuning process.

Example

>>> from matchzoo.models import DenseBaselineModel
>>> model = DenseBaselineModel()
>>> max_evals = 4
>>> trials = tune(model, max_evals)
>>> len(trials) == max_evals
True
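
The sample, evaluate and pick-the-best loop described above can be illustrated with plain random search. This is a deliberate simplification, not the hyperopt-based procedure: the search strategy is swapped for uniform random choice over finite candidate lists, and tune_sketch, toy_loss and the parameter names are all hypothetical.

```python
import random


def tune_sketch(spaces, loss_fn, max_evals=32, seed=0):
    """Random search: sample, score, record; return every trial."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_evals):
        # Draw one value per tunable parameter from its candidate list.
        sample = {name: rng.choice(options)
                  for name, options in spaces.items()}
        trials.append({'sample': sample, 'loss': loss_fn(sample)})
    return trials


def toy_loss(sample):
    """Made-up loss with its minimum at units=64, dropout=0.2."""
    return abs(sample['units'] - 64) / 64 + abs(sample['dropout'] - 0.2)


spaces = {'units': [16, 32, 64, 128], 'dropout': [0.0, 0.2, 0.5]}
sketch_trials = tune_sketch(spaces, toy_loss, max_evals=8)
best = min(sketch_trials, key=lambda t: t['loss'])
print(len(sketch_trials))
print(best)
```

Returning every trial rather than only the best one matches the documented return value and leaves the choice of selection criterion to the caller.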
