matchzoo.engine package

Submodules

matchzoo.engine.base_model module

Base Model.

class matchzoo.engine.base_model.BaseModel(params=None, backend=None)

Bases: abc.ABC

Abstract base class of all matchzoo models.

BACKEND_FILENAME = 'backend.h5'
PARAMS_FILENAME = 'params.dill'
backend

Return the model backend, a keras model instance.

Return type: Model
build()

Build the model. Each subclass needs to implement this method.

Example

>>> BaseModel()  
Traceback (most recent call last):
...
TypeError: Can't instantiate abstract class BaseModel ...
>>> class MyModel(BaseModel):
...     def build(self):
...         pass
>>> MyModel
<class 'matchzoo.engine.base_model.MyModel'>
compile()

Compile model for training.

evaluate(x, y, batch_size=128, verbose=1)

Evaluate the model.

See keras.models.Model.evaluate() for more details.

Parameters:
  • x (Union[ndarray, List[ndarray]]) -- input data
  • y (ndarray) -- labels
  • batch_size (int) -- number of samples per evaluation batch
  • verbose (int) -- verbosity mode, 0 or 1
Return type:

Union[float, List[float]]

Returns:

scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.backend.metrics_names will give you the display labels for the scalar outputs.
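
Because evaluate() returns either a bare float or a list of floats, callers often normalise the result before logging it. The following is a minimal sketch, assuming the labels come from model.backend.metrics_names as described above. The helper name label_scores is hypothetical and not part of matchzoo, and the numbers are stand-in values.

```python
from typing import Dict, List, Union


def label_scores(names: List[str],
                 result: Union[float, List[float]]) -> Dict[str, float]:
    """Pair evaluate() output with its display labels.

    Hypothetical helper: `names` stands in for model.backend.metrics_names.
    """
    # A single-output model without metrics yields one scalar, not a list.
    values = result if isinstance(result, (list, tuple)) else [result]
    if len(names) != len(values):
        raise ValueError(
            'expected %d values, got %d' % (len(names), len(values)))
    return dict(zip(names, (float(value) for value in values)))


# Stand-in values; a real call would pass the result of model.evaluate(x, y).
print(label_scores(['loss'], 0.25))
print(label_scores(['loss', 'mae'], [0.25, 0.4]))
```

Normalising to a dict keeps downstream logging code identical for models with and without metrics.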

fit(x, y, batch_size=128, epochs=1, verbose=1)

Fit the model.

See keras.models.Model.fit() for more details.

Parameters:
  • x (Union[ndarray, List[ndarray]]) -- input data.
  • y (ndarray) -- labels.
  • batch_size (int) -- number of samples per gradient update.
  • epochs (int) -- number of epochs to train the model.
  • verbose (int) -- 0, 1, or 2. Verbosity mode. 0 = silent, 1 = verbose, 2 = one log line per epoch.
Return type:

History

Returns:

A keras.callbacks.History instance. Its history attribute contains all information collected during training.
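
The history attribute is a dict that maps each tracked quantity to a list with one value per epoch. The following is a minimal sketch of picking the best epoch from such a dict. The data is made up to mimic the shape of model.fit(...).history, and best_epoch is a hypothetical helper, not a matchzoo or keras function.

```python
from typing import Dict, List, Tuple


def best_epoch(history: Dict[str, List[float]],
               key: str = 'loss') -> Tuple[int, float]:
    """Return the (1-based epoch, value) with the lowest value for `key`."""
    values = history[key]
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


# Made-up numbers shaped like `model.fit(...).history`.
fake_history = {'loss': [0.9, 0.6, 0.7]}
print(best_epoch(fake_history))
```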

classmethod get_default_params()

Model default parameters.

The common usage is to instantiate matchzoo.engine.ModelParams
first, then set the model-specific parameters.

Examples

>>> class MyModel(BaseModel):
...     def build(self):
...         print(self._params['num_eggs'], 'eggs')
...         print('and', self._params['ham_type'])
...
...     @classmethod
...     def get_default_params(cls):
...         params = engine.ParamTable()
...         params.add(engine.Param('num_eggs', 512))
...         params.add(engine.Param('ham_type', 'Parma Ham'))
...         return params
>>> my_model = MyModel()
>>> my_model.build()
512 eggs
and Parma Ham

Note that all parameters must be serialisable for the entire model to be serialisable. Therefore, it is strongly recommended to use python native data types to store parameters.

Return type: ParamTable
Returns: model parameters
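
To illustrate the serialisability note above, here is a minimal sketch that checks whether a value survives a serialisation round trip. It assumes the standard-library pickle module; the serializer matchzoo actually uses may accept more types. The helper name survives_pickle is hypothetical.

```python
import pickle
from typing import Any


def survives_pickle(value: Any) -> bool:
    """Return True if `value` can be pickled and restored unchanged."""
    try:
        return pickle.loads(pickle.dumps(value)) == value
    except (pickle.PicklingError, AttributeError, TypeError):
        # Lambdas, open files and similar objects typically end up here.
        return False


# Native data types round-trip; an anonymous function does not.
print(survives_pickle({'num_eggs': 512, 'ham_type': 'Parma Ham'}))
print(survives_pickle(lambda x: x))
```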
guess_and_fill_missing_params()

Guess and fill missing parameters in params.

Note: likely to be moved to a higher level API in the future.

params

return -- model parameters.

Return type: ParamTable
predict(x, batch_size=128)

Generate output predictions for the input samples.

See keras.models.Model.predict() for more details.

Parameters:
  • x (Union[ndarray, List[ndarray]]) -- input data
  • batch_size -- number of samples per prediction batch
Return type:

ndarray

Returns:

numpy array(s) of predictions

save(dirpath)

Save the model.

A saved model is represented as a directory with two files: a model parameters file saved by pickle (PARAMS_FILENAME) and a model h5 file saved by keras (BACKEND_FILENAME).

Parameters: dirpath (Union[str, Path]) -- directory path of the saved model
matchzoo.engine.base_model.load_model(dirpath)

Load a model. This is the inverse of BaseModel.save().

Parameters: dirpath (Union[str, Path]) -- directory path of the saved model
Return type: BaseModel
Returns: a BaseModel instance
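
The directory layout described for save() and load_model() can be sketched with the standard library alone. This is not matchzoo's implementation: the parameters are a plain dict, the keras h5 file is replaced by placeholder bytes, and save_sketch and load_sketch are hypothetical names. Only the two file names are taken from the class attributes above, and pickle is used because the description of save() mentions it.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Tuple, Union

BACKEND_FILENAME = 'backend.h5'
PARAMS_FILENAME = 'params.dill'


def save_sketch(dirpath: Union[str, Path],
                params: dict, backend: bytes) -> None:
    """Write one parameters file and one backend file into `dirpath`."""
    dirpath = Path(dirpath)
    # Design choice of this sketch: fail if the directory already exists.
    dirpath.mkdir(parents=True, exist_ok=False)
    with open(dirpath / PARAMS_FILENAME, 'wb') as params_file:
        pickle.dump(params, params_file)
    # Placeholder: a real model would let keras write the h5 file here.
    (dirpath / BACKEND_FILENAME).write_bytes(backend)


def load_sketch(dirpath: Union[str, Path]) -> Tuple[dict, bytes]:
    """Read back what save_sketch() wrote."""
    dirpath = Path(dirpath)
    with open(dirpath / PARAMS_FILENAME, 'rb') as params_file:
        params = pickle.load(params_file)
    return params, (dirpath / BACKEND_FILENAME).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'my-model'
    save_sketch(target, {'num_eggs': 512}, b'placeholder weights')
    print(sorted(path.name for path in target.iterdir()))
    print(load_sketch(target))
```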

matchzoo.engine.base_preprocessor module

Base Preprocessor, consisting of multiple ProcessorUnit instances.

Each sub-class should employ a sequence of ProcessorUnit and StatefulProcessorUnit to handle input data.

class matchzoo.engine.base_preprocessor.BasePreprocessor

Bases: object

Abstract base class for model-wise processors.

fit_transform(text_left, text_right, labels)

Apply fit-transform on input data.

This is an abstract method and must be implemented in subclasses.

handle(process_unit, input)

Infer whether a process_unit is stateful.

Parameters:
  • process_unit (ProcessorUnit) -- Given a process unit instance.
  • input (Any) -- process input text.
Return ctx:

Context as dict, i.e. fitted parameters.

Return type:

Union[dict, Any]

Returns:

Transformed user input given transformer.
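
The stateful versus stateless dispatch described for handle() can be sketched as follows. Everything here is an assumption made for illustration: the two unit classes, the duck-typed check for a fit() method, the state attribute, and the choice to return a (transformed, ctx) pair with an empty ctx for stateless units. The real ProcessorUnit and StatefulProcessorUnit interfaces may differ.

```python
from typing import Any, List, Tuple


class LowercaseUnit:
    """Hypothetical stateless unit: transform() needs no fitting."""

    def transform(self, tokens: List[str]) -> List[str]:
        return [token.lower() for token in tokens]


class VocabularyUnit:
    """Hypothetical stateful unit: fit() learns a term index first."""

    def __init__(self) -> None:
        self.state: dict = {}

    def fit(self, tokens: List[str]) -> None:
        vocabulary = sorted(set(tokens))
        self.state = {'term_index': {t: i for i, t in enumerate(vocabulary)}}

    def transform(self, tokens: List[str]) -> List[int]:
        return [self.state['term_index'][token] for token in tokens]


def handle_sketch(process_unit: Any, data: Any) -> Tuple[Any, dict]:
    """Fit stateful units before transforming; only transform the rest."""
    ctx: dict = {}
    if hasattr(process_unit, 'fit'):  # assumption: duck-typed statefulness
        process_unit.fit(data)
        ctx = process_unit.state
    return process_unit.transform(data), ctx


print(handle_sketch(LowercaseUnit(), ['Hello', 'World']))
print(handle_sketch(VocabularyUnit(), ['b', 'a', 'b']))
```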

matchzoo.engine.base_task module

Base task.

class matchzoo.engine.base_task.BaseTask

Bases: abc.ABC

Base Task, shouldn't be used directly.

classmethod list_available_losses()
Return type: list
Returns: a list of available losses.
classmethod list_available_metrics()
Return type: list
Returns: a list of available metrics.
output_shape

return -- output shape of a single sample of the task.

Return type: tuple
matchzoo.engine.base_task.list_available_tasks(base=<class 'matchzoo.engine.base_task.BaseTask'>)
Return type: List[Type[BaseTask]]
Returns: a list of available task types.
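
One common way to list every type derived from a base class is to walk __subclasses__() recursively. The following is a minimal sketch of that technique with stand-in classes. Whether list_available_tasks() works this way is an assumption, and the class and function names below are hypothetical.

```python
import abc
from typing import List, Type


class Task(abc.ABC):
    """Stand-in for BaseTask."""


class Ranking(Task):
    pass


class Classification(Task):
    pass


class BinaryClassification(Classification):
    pass


def list_subclasses(base: Type[Task] = Task) -> List[Type[Task]]:
    """Collect every direct and indirect subclass of `base`, depth first."""
    found = []
    for sub in base.__subclasses__():
        found.append(sub)
        found.extend(list_subclasses(sub))
    return found


print([cls.__name__ for cls in list_subclasses()])
```

Passing a narrower base, such as Classification here, restricts the result to that branch of the hierarchy.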

matchzoo.engine.hyper_spaces module

Hyper parameter search spaces wrapping hyperopt.

class matchzoo.engine.hyper_spaces.HyperoptProxy(hyperopt_func, **kwargs)

Bases: object

Hyperopt proxy class.

See hyperopt's documentation for more details: https://github.com/hyperopt/hyperopt/wiki/FMin

Reason for these wrappers:

A hyper space in hyperopt requires a label to instantiate. This label is later used to refer back to the original hyper space that was sampled. In matchzoo, hyper spaces are used in matchzoo.engine.Param. Only if a hyper space's label matches the name of its parent matchzoo.engine.Param can matchzoo correctly trace a sampled value back to its parameter. This could be achieved by asking the user to always use the same name for a parameter and its hyper space, but typos can occur. These wrappers therefore hide the hyper space's label and always bind it to its parameter's name.

Example

>>> from hyperopt.pyll.stochastic import sample
>>> numbers = [0, 1, 2]
>>> sample(choice(options=numbers)('numbers')) in numbers
True
>>> 0 <= sample(quniform(low=0, high=9)('digit')) <= 9
True
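
The deferred-label idea behind these wrappers can be sketched without hyperopt. In the sketch below, fake_choice is a stand-in for a hyperopt space constructor that takes the label as its first argument, and LabelDeferringProxy is a hypothetical name. It mirrors the call shape seen in the example above, where the label is passed in a second call.

```python
from typing import Any, Callable


class LabelDeferringProxy:
    """Hold a space constructor and its arguments until a label is known."""

    def __init__(self, space_func: Callable[..., Any], **kwargs: Any) -> None:
        self._func = space_func
        self._kwargs = kwargs

    def __call__(self, name: str) -> Any:
        # The label is supplied last, by whoever owns the parameter name.
        return self._func(name, **self._kwargs)


def fake_choice(label: str, options: list) -> dict:
    """Stand-in for a hyperopt space constructor (not the real API)."""
    return {'label': label, 'options': options}


deferred = LabelDeferringProxy(fake_choice, options=[0, 1, 2])
print(deferred('numbers'))
```

Because the user never types the label, it cannot drift out of sync with the parameter name.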
class matchzoo.engine.hyper_spaces.choice(options)

Bases: matchzoo.engine.hyper_spaces.HyperoptProxy

hyperopt.hp.choice() proxy.

class matchzoo.engine.hyper_spaces.quniform(low, high, q=1)

Bases: matchzoo.engine.hyper_spaces.HyperoptProxy

hyperopt.hp.quniform() proxy.

class matchzoo.engine.hyper_spaces.uniform(low, high)

Bases: matchzoo.engine.hyper_spaces.HyperoptProxy

hyperopt.hp.uniform() proxy.

matchzoo.engine.param module

Parameter class.

class matchzoo.engine.param.Param(name, value=None, hyper_space=None, validator=None)

Bases: object

Parameter class.

Basic usages with a name and value:

>>> param = Param('my_param', 10)
>>> param.name
'my_param'
>>> param.value
10

Use with a validator to make sure the parameter always keeps a valid value.

>>> param = Param(
...     name='my_param',
...     value=5,
...     validator=lambda x: 0 < x < 20
... )
>>> param.validator  
<function <lambda> at 0x...>
>>> param.value
5
>>> param.value = 10
>>> param.value
10
>>> param.value = -1
Traceback (most recent call last):
    ...
ValueError: Validator not satifised.
The validator's definition is as follows:
validator=lambda x: 0 < x < 20

Use with a hyper space. Setting up a hyper space for a parameter makes the parameter tunable in a matchzoo.engine.Tuner.

>>> from matchzoo.engine.hyper_spaces import quniform
>>> param = Param(
...     name='positive_num',
...     value=1,
...     hyper_space=quniform(low=1, high=5)
... )
>>> param.hyper_space  
<hyperopt.pyll.base.Apply object at 0x...>
>>> from hyperopt.pyll.stochastic import sample
>>> samples = [sample(param.hyper_space) for _ in range(64)]
>>> set(samples) == {1, 2, 3, 4, 5}
True

The boolean value of a Param instance is True only when its value is not None, because falsy values such as zero or an empty list are valid parameter values. In other words, the boolean value answers the question "is the parameter value filled?".

>>> param = Param('dropout')
>>> if param:
...     print('OK')
>>> param = Param('dropout', 0)
>>> if param:
...     print('OK')
OK

A _pre_assignment_hook is initialized as a data type converter if the value is set to a number, to keep the parameter's data type consistent. This conversion supports python built-in numbers, numpy numbers, and any number that inherits from numbers.Number.

>>> param = Param('float_param', 0.5)
>>> param.value = 10
>>> param.value
10.0
>>> type(param.value)
<class 'float'>
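
The three behaviours above (validation on assignment, "filled" truthiness, and numeric type keeping) can be sketched in a small stand-alone class. MiniParam is a hypothetical toy, not the matchzoo implementation. Skipping validation for None and the wording of the error message are assumptions of this sketch.

```python
import numbers
from typing import Any, Callable, Optional


class MiniParam:
    """Toy parameter with validation, filled-ness and numeric type keeping."""

    def __init__(self, name: str, value: Any = None,
                 validator: Optional[Callable[[Any], bool]] = None) -> None:
        self.name = name
        self._validator = validator
        # Remember the numeric type of the initial value, if it is a number.
        is_number = isinstance(value, numbers.Number)
        self._convert = type(value) if is_number else None
        self._value = None
        self.value = value

    @property
    def value(self) -> Any:
        return self._value

    @value.setter
    def value(self, new_value: Any) -> None:
        if new_value is not None:  # assumption: None means "not filled yet"
            if self._convert is not None:
                new_value = self._convert(new_value)
            if self._validator is not None and not self._validator(new_value):
                raise ValueError(
                    'invalid value for %s: %r' % (self.name, new_value))
        self._value = new_value

    def __bool__(self) -> bool:
        # "Filled" means not None, so 0 and [] still count as set.
        return self._value is not None


param = MiniParam('float_param', 0.5)
param.value = 10
print(param.value, bool(MiniParam('dropout')), bool(MiniParam('dropout', 0)))
```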
hyper_space

return -- Hyper space of the parameter.

Return type: Apply
name

return -- Name of the parameter.

Return type: str
validator

return -- Validator of the parameter.

Return type: Callable[[Any], bool]
value

return -- Value of the parameter.

Return type: Any

matchzoo.engine.param_table module

Parameters table class.

class matchzoo.engine.param_table.ParamTable

Bases: object

Parameter table class.

Example

>>> params = ParamTable()
>>> params.add(Param('ham', 'Parma Ham'))
>>> params.add(Param('egg', 'Over Easy'))
>>> params['ham']
'Parma Ham'
>>> params['egg']
'Over Easy'
>>> print(params)
ham                           Parma Ham
egg                           Over Easy
>>> params.add(Param('egg', 'Sunny side Up'))
Traceback (most recent call last):
    ...
ValueError: Parameter named egg already exists.
To re-assign parameter egg value, use `params["egg"] = value` instead.
add(param)
Parameters: param (Param) -- parameter to add.
completed()
Return type: bool
Returns: True if all params are filled, False otherwise.

Example

>>> import matchzoo
>>> model = matchzoo.models.NaiveModel()
>>> model.params.completed()
False
>>> model.guess_and_fill_missing_params()
>>> model.params.completed()
True
hyper_space

return -- Hyper space of the table, a valid hyperopt graph.

Return type: dict
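
The table behaviour documented above (unique names on add(), re-assignment by key, and completed() reporting whether every value is filled) can be sketched with a plain dict. MiniTable is a hypothetical toy: unlike ParamTable.add(), its add() takes a name and a value rather than a Param, and raising KeyError when assigning to an unknown name is an assumption.

```python
from typing import Any, Dict


class MiniTable:
    """Toy table keyed by parameter name; None marks an unfilled value."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}

    def add(self, name: str, value: Any = None) -> None:
        # Adding twice is rejected; re-assignment goes through table[name].
        if name in self._values:
            raise ValueError('Parameter named %s already exists.' % name)
        self._values[name] = value

    def __getitem__(self, name: str) -> Any:
        return self._values[name]

    def __setitem__(self, name: str, value: Any) -> None:
        if name not in self._values:
            raise KeyError(name)
        self._values[name] = value

    def completed(self) -> bool:
        return all(value is not None for value in self._values.values())


table = MiniTable()
table.add('ham', 'Parma Ham')
table.add('egg')
print(table.completed())
table['egg'] = 'Over Easy'
print(table.completed())
```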

matchzoo.engine.tune module

Tuner class. Currently a minimum working demo.

matchzoo.engine.tune.tune(model, max_evals=32)

Tune the model max_evals times.

Construct a hyper parameter search space by extracting all parameters in model that have a defined hyper space. Then, using the hyperopt API, iteratively sample parameters, test for loss, and pick the best trial. Currently a minimum working demo.

Parameters:
  • model (BaseModel) -- the model to tune.
  • max_evals (int) -- Number of evaluations of a single tuning process.
Return type:

list

Returns:

A list of trials of the tuning process.

Example

>>> from matchzoo.models import DenseBaselineModel
>>> model = DenseBaselineModel()
>>> max_evals = 4
>>> trials = tune(model, max_evals)
>>> len(trials) == max_evals
True
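
The sample, evaluate, and pick-the-best loop described above can be sketched with the standard library. This sketch swaps hyperopt's search for plain uniform random sampling, so it is not what tune() does internally. The function name random_search, the trial dict layout, the search space, and the loss function are all hypothetical.

```python
import random
from typing import Callable, Dict, List


def random_search(spaces: Dict[str, Callable[[random.Random], float]],
                  loss_fn: Callable[[Dict[str, float]], float],
                  max_evals: int = 32,
                  seed: int = 0) -> List[dict]:
    """Sample every tunable parameter `max_evals` times, recording the loss."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_evals):
        sampled = {name: draw(rng) for name, draw in spaces.items()}
        trials.append({'params': sampled, 'loss': loss_fn(sampled)})
    return trials


# Hypothetical search space and loss; the loss is smallest at units == 3.
spaces = {'units': lambda rng: rng.randint(1, 5)}
trials = random_search(spaces, lambda p: (p['units'] - 3) ** 2, max_evals=4)
best = min(trials, key=lambda trial: trial['loss'])
print(len(trials), best)
```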

Module contents