[Natural Language Processing with TensorFlow and Machine Learning] Chapter 2. Preparing for NLP Development (TensorFlow, scikit-learn, and other libraries)


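01. TensorFlow

TensorFlow is a machine learning library that Google released as open source.

As the name suggests, tensor (an N-dimensional array) plus flow refers to performing numerical computation with a data flow graph.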

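The tf.keras.layers module

Building a deep learning model with TensorFlow is much like stacking blocks one at a time to build up the whole structure.

Among the various modules that serve as those blocks is Keras.

Keras is a separate open-source deep learning library, like TensorFlow, but TensorFlow also supports using Keras from within it.

One of those Keras modules is tf.keras.layers.

Keras has the advantage of being more intuitive and easier to use than plain TensorFlow.

This book explains things mainly through this module.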

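tf.keras.layers.Dense

Dense is the most basic form of a neural network layer.

That is, it computes y = f(Wx + b), where W is the weight matrix, b is the bias, and f is the activation function.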
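To make the formula concrete, here is a minimal pure-Python sketch of y = f(Wx + b) for a single input vector. It is only an illustration, not how Keras implements Dense; the helper names (sigmoid, dense_forward) and the toy values are assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, used here as the activation f."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_forward(x, W, b, activation=sigmoid):
    """Compute y = f(Wx + b) for one input vector x.

    W is a list of rows (one row per output unit); b holds one bias per unit.
    """
    y = []
    for row, bias in zip(W, b):
        z = sum(w * xi for w, xi in zip(row, x)) + bias
        y.append(activation(z))
    return y


# Hypothetical toy values: 2 inputs, 3 output units.
x = [1.0, 2.0]
W = [[0.5, -0.25], [0.0, 0.0], [1.0, 1.0]]
b = [0.0, 0.0, -3.0]
y = dense_forward(x, W, b)
print(y)
```

The number of rows in W plays the role of the units argument: one output value per unit.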

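# TensorFlow 1.x API; in 2.x, tf.placeholder lives under tf.compat.v1
import tensorflow as tf

INPUT_SIZE = (20, 1)           # 20 rows with 1 feature each
CONV_INPUT_SIZE = (1, 28, 28)  # (batch, steps, channels), for the Conv1D examples
IS_TRAINING = True

# Placeholder for the input, followed by a Dense layer with 10 output units
input = tf.placeholder(tf.float32, shape=INPUT_SIZE)
output = tf.keras.layers.Dense(units=10, activation=tf.nn.sigmoid)(input)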

It can be used like this. The arguments that can be specified as options when creating a Dense layer are listed below.

__init__(
units,
activation=None,
use_bias=True,
kernel_initializer='glorot_uniform',
bias_initializer='zeros',
kernel_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
bias_constraint=None,
**kwargs
)

units: size of the output (number of units)
activation: activation function
use_bias: whether to use a bias

among others.

Dense Layer with 1 hidden layer

Case: a hidden layer with 10 nodes and a final output with 2 nodes.

input = tf.placeholder(tf.float32, shape = INPUT_SIZE)
hidden = tf.keras.layers.Dense(units = 10, activation = tf.nn.sigmoid)(input)
output = tf.keras.layers.Dense(units = 2, activation = tf.nn.sigmoid)(hidden)
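As a rough check on what this two-layer network contains, the sketch below counts its trainable parameters. It assumes the first dimension of INPUT_SIZE = (20, 1) is treated as the batch, so each row has 1 feature; dense_param_count is a hypothetical helper, not a Keras function.

```python
def dense_param_count(input_dim, units, use_bias=True):
    """Weights (input_dim * units) plus one bias per unit when use_bias is True."""
    return input_dim * units + (units if use_bias else 0)


# Assumption: INPUT_SIZE = (20, 1) means 20 rows with 1 feature each.
hidden_params = dense_param_count(input_dim=1, units=10)
output_params = dense_param_count(input_dim=10, units=2)
total_params = hidden_params + output_params
print(hidden_params, output_params, total_params)
```

Setting use_bias=False drops the per-unit bias terms, which is the only thing that argument changes in the count.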

Dropout

This module applies dropout, one of the most common regularization methods, to prevent overfitting.

With it, dropout can be applied to the input of a tf.keras.layers layer.

input = tf.placeholder(tf.float32, shape=INPUT_SIZE)
dropout = tf.keras.layers.Dropout(rate=0.5)(input)

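Arguments that can be set for dropout

rate: the fraction of inputs to drop. With rate=0.2, each input value is set to 0 with probability 0.2, so about 20% of the inputs become 0.
seed: optional random seed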
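The effect of rate and seed can be sketched in pure Python. This is an illustrative version of so-called inverted dropout, where surviving values are scaled by 1 / (1 - rate) during training and nothing is changed at inference time; the function name and toy inputs are assumptions, and it is not the Keras implementation.

```python
import random


def dropout(values, rate, training=True, seed=None):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        # At inference time the inputs pass through unchanged.
        return list(values)
    rng = random.Random(seed)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


inputs = [1.0] * 10
dropped = dropout(inputs, rate=0.2, seed=0)
unchanged = dropout(inputs, rate=0.2, training=False)
print(dropped)
print(unchanged)
```

Because each value is dropped independently, the fraction that ends up as 0 in any one call is only approximately equal to rate.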

Dense Layer with 1 hidden layer and dropout

Applying dropout to the network structure from the example above looks like this.

input = tf.placeholder(tf.float32, shape=INPUT_SIZE)
dropout = tf.keras.layers.Dropout(rate=0.2)(input)
hidden = tf.keras.layers.Dense(units=10, activation=tf.nn.sigmoid)(dropout)
output = tf.keras.layers.Dense(units=2, activation=tf.nn.sigmoid)(hidden)

tf.keras.layers.Conv1D

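Convolution operations are divided into Conv1D, Conv2D, and Conv3D.

The kind commonly applied to images is Conv2D.

Convolutions can be classified by two criteria: the direction in which the convolution proceeds and the shape of the output it produces.

In natural language processing, Conv1D is mainly used.

- Filter: the part that performs the convolution.

A convolution layer also takes arguments such as the filter size, the number of filters, and the stride.

Training performance varies greatly depending on how these arguments are set, so it is important to know exactly what can be configured and what each argument means.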

__init__(
filters,
kernel_size,
strides=1,
padding='valid',
data_format='channels_last',
dilation_rate=1,
activation=None,
use_bias=True,
kernel_initializer='glorot_uniform',
bias_initializer='zeros',
kernel_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
bias_constraint=None,
**kwargs
)

filters: the number of filters, i.e. the dimensionality of the output.
kernel_size: the size of the filter, given as an integer or a list/tuple of a single integer. It is the length of the window the convolution is applied to.
strides: the stride to apply.
padding: the padding method. One of "valid", "causal" or "same" (case-insensitive). "causal" results in causal (dilated) convolutions, e.g. output[t] does not depend on input[t+1:]. Useful when modeling temporal data where the model should not violate the temporal order. See WaveNet: A Generative Model for Raw Audio, section 2.1.
data_format: how the data is laid out. A string, one of channels_last (default) or channels_first.
dilation_rate: an integer or tuple/list of a single integer, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1.
activation: the activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).
use_bias: Boolean, whether the layer uses a bias vector.
kernel_initializer: initializer for the kernel weights matrix.
bias_initializer: initializer for the bias vector.
kernel_regularizer: regularizer function applied to the kernel weights matrix.
bias_regularizer: regularizer function applied to the bias vector.
activity_regularizer: regularizer function applied to the output of the layer (its "activation").
kernel_constraint: constraint function applied to the kernel matrix.
bias_constraint: constraint function applied to the bias vector.
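How kernel_size, strides, padding, and dilation_rate interact is easiest to see through the number of output steps they produce. The helper below is a sketch of the usual output-length arithmetic for a 1D convolution; the function name is an assumption, and it is meant as a reading aid, not as the library's own code.

```python
def conv1d_output_length(input_length, kernel_size, strides=1,
                         padding="valid", dilation_rate=1):
    """Number of output steps produced by a 1D convolution."""
    # A dilated kernel covers dilation_rate * (kernel_size - 1) + 1 input steps.
    effective_kernel = dilation_rate * (kernel_size - 1) + 1
    padding = padding.lower()
    if padding in ("same", "causal"):
        length = input_length
    elif padding == "valid":
        length = input_length - effective_kernel + 1
    else:
        raise ValueError("padding must be 'valid', 'same' or 'causal'")
    # Ceiling division by the stride.
    return (length + strides - 1) // strides


valid_len = conv1d_output_length(28, 3, padding="valid")
same_len = conv1d_output_length(28, 3, padding="same")
strided_len = conv1d_output_length(28, 3, strides=2, padding="valid")
dilated_len = conv1d_output_length(28, 3, padding="valid", dilation_rate=2)
print(valid_len, same_len, strided_len, dilated_len)
```

With "valid" padding the sequence shrinks by the effective kernel length minus one, while "same" and "causal" keep the input length when the stride is 1.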

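Reference: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv1D

For a Conv1D filter, only the width needs to be set.

Why? (I still seem to lack a solid grasp of convolution and of how input and output node sizes are set in machine learning. I should go back through an online NLP course and listen again from the machine learning part onward.)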
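One way to approach the question: a Conv1D filter always covers every input channel and only slides along the steps axis, so its length along that axis is the only size left to choose. The pure-Python sketch below shows a single filter with "valid" padding and stride 1; the function name and toy values are assumptions for illustration, and this is not the Keras implementation.

```python
def conv1d_valid(sequence, kernel, bias=0.0):
    """Single-filter 1D convolution with 'valid' padding and stride 1.

    sequence: list of steps, each step a list of channel values.
    kernel:   list of kernel_size rows, each with one weight per channel.
    """
    kernel_size = len(kernel)
    outputs = []
    for start in range(len(sequence) - kernel_size + 1):
        total = bias
        for k in range(kernel_size):
            step = sequence[start + k]
            # Every channel of the step is used; the window moves only along steps.
            total += sum(w * v for w, v in zip(kernel[k], step))
        outputs.append(total)
    return outputs


# Hypothetical toy input: 5 steps (e.g. tokens) with 2 channels (e.g. embedding dims).
sequence = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0], [5.0, 0.0]]
kernel = [[1.0, 1.0], [0.0, 0.0], [-1.0, 1.0]]  # kernel_size = 3, 2 channels
result = conv1d_valid(sequence, kernel)
print(result)
```

In this picture, filters is simply how many such kernels are applied in parallel, each producing one output channel.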

Basic usage

Convolutional layer

input = tf.placeholder(tf.float32, shape=CONV_INPUT_SIZE)
conv = tf.keras.layers.Conv1D(
    filters=10,      # number of filters
    kernel_size=3,   # length of the convolution window
    padding='same',
    activation=tf.nn.relu)(input)

Convolutional layer with dropout

A convolutional layer with dropout applied to its input.

input = tf.placeholder(tf.float32, shape=CONV_INPUT_SIZE)
dropout = tf.keras.layers.Dropout(rate=0.2)(input)
conv = tf.keras.layers.Conv1D(
    filters=10,
    kernel_size=3,
    padding='same',
    activation=tf.nn.relu)(dropout)

tf.keras.layers.MaxPool1D

The pooling techniques used in convolutional neural networks include max pooling and average pooling.

__init__(
pool_size=2,
strides=None,
padding='valid',
data_format='channels_last',
**kwargs
)

pool_size: Integer, size of the max pooling windows.
strides: Integer, or None. Factor by which to downscale. E.g. 2 will halve the input. If None, it will default to pool_size.
padding: One of "valid" or "same" (case-insensitive).
data_format: A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps).
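The difference between max pooling and average pooling, and the role of pool_size and strides, can be sketched on a plain list of numbers. Both helpers below assume "valid" padding and a single channel; their names and the sample values are assumptions for illustration only.

```python
def max_pool1d(values, pool_size=2, strides=None):
    """1D max pooling with 'valid' padding over a list of numbers."""
    if strides is None:
        strides = pool_size  # strides default to pool_size
    return [max(values[i:i + pool_size])
            for i in range(0, len(values) - pool_size + 1, strides)]


def average_pool1d(values, pool_size=2, strides=None):
    """1D average pooling with 'valid' padding over a list of numbers."""
    if strides is None:
        strides = pool_size
    return [sum(values[i:i + pool_size]) / pool_size
            for i in range(0, len(values) - pool_size + 1, strides)]


values = [1, 3, 2, 5, 4, 0]
max_pooled = max_pool1d(values, pool_size=2)
avg_pooled = average_pool1d(values, pool_size=2)
print(max_pooled)
print(avg_pooled)
```

With pool_size=2 and the default stride, the sequence length is halved, which matches the description of strides above.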

Input -> Dropout -> Convolutional layer -> MaxPooling -> Dense layer with 1 hidden layer -> Output

input = tf.placeholder(tf.float32, shape=CONV_INPUT_SIZE)
dropout = tf.keras.layers.Dropout(rate=0.2)(input)
conv = tf.keras.layers.Conv1D(
    filters=10,
    kernel_size=3,
    padding='same',
    activation=tf.nn.relu)(dropout)
max_pool = tf.keras.layers.MaxPool1D(pool_size=3, padding='same')(conv)
flatten = tf.keras.layers.Flatten()(max_pool)
hidden = tf.keras.layers.Dense(units=50, activation=tf.nn.relu)(flatten)
output = tf.keras.layers.Dense(units=10, activation=tf.nn.softmax)(hidden)
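To follow how the tensor shape changes through this pipeline, the sketch below traces it by hand in pure Python. It assumes the input shape (1, 28, 28) is read as (batch, steps, channels), that the convolution uses 'same' padding with stride 1, and that pooling with 'same' padding and the default stride yields ceil(steps / pool_size) steps; trace_pipeline_shapes is a hypothetical helper, not part of TensorFlow.

```python
import math


def trace_pipeline_shapes(input_shape, filters=10, pool_size=3,
                          hidden_units=50, output_units=10):
    """Return (layer name, output shape) pairs for the pipeline above."""
    batch, steps, channels = input_shape
    shapes = [("input", (batch, steps, channels))]
    # Dropout never changes the shape.
    shapes.append(("dropout", (batch, steps, channels)))
    # Conv1D with padding='same' and stride 1 keeps the number of steps.
    shapes.append(("conv1d", (batch, steps, filters)))
    # MaxPool1D with padding='same'; strides default to pool_size.
    pooled_steps = math.ceil(steps / pool_size)
    shapes.append(("max_pool", (batch, pooled_steps, filters)))
    # Flatten merges every non-batch dimension.
    shapes.append(("flatten", (batch, pooled_steps * filters)))
    shapes.append(("hidden", (batch, hidden_units)))
    shapes.append(("output", (batch, output_units)))
    return shapes


shapes = trace_pipeline_shapes((1, 28, 28))
for name, shape in shapes:
    print(name, shape)
```

The final shape has 10 values per example, one per class that the softmax output distributes probability over.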


mindsee Ai
