import re
from typing import Any, Dict, Tuple

import dateparser
import pytz


def parse_amount_and_currency(text: str) -> Tuple[str, float]:
    """Parse a yen amount such as "¥30000", "30,000 yen", or a bare number.

    Returns a (currency, amount) tuple, or ("", 0.0) when no number is found.
    """
    # Require a leading digit so a lone comma never reaches float().
    number = r"([0-9][0-9,]*)"
    # Amount prefixed with the ¥ symbol
    m = re.search(r"¥\s?" + number, text)
    if m:
        return ("JPY", float(m.group(1).replace(',', '')))
    # Amount followed by "yen" or "JPY" (case-insensitive)
    m = re.search(number + r"\s*(?:yen|JPY)\b", text, flags=re.I)
    if m:
        return ("JPY", float(m.group(1).replace(',', '')))
    # Fallback: the first bare number is assumed to be yen. Note that this can
    # also pick up digits from a date such as "2025-10-10".
    m = re.search(number, text)
    if m:
        return ("JPY", float(m.group(1).replace(',', '')))
    return ("", 0.0)


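def extract_intent_and_slots(text: str) -> Dict[str, Any]:
    """Rule-based NLU: classify the intent of `text` and extract its slots.

    Returns a dict with the keys 'intent', 'nlu_confidence', and 'slots'.
    """
    text_l = text.lower()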
    result = {
        'intent': 'other',
        'nlu_confidence': 0.5,
        'slots': {}
    }

    # detect request for human
    if any(kw in text_l for kw in ['operator', 'human', 'representative', 'staff', 'talk to']):
        result['intent'] = 'request_human_operator'
        result['nlu_confidence'] = 0.9
        return result

    # detect payment commitment
    if any(kw in text_l for kw in ['pay', 'payment', 'i will pay', 'i can pay']):
        result['intent'] = 'payment_commitment'
        result['nlu_confidence'] = 0.85
        # amount: stored as a whole-yen string; the currency is not used here
        _currency, amount = parse_amount_and_currency(text)
        if amount > 0:
            result['slots']['amount'] = str(int(amount))
        # date: use dateparser with Japan timezone
        settings = {'TIMEZONE': 'Asia/Tokyo', 'RETURN_AS_TIMEZONE_AWARE': True}
        # try to parse phrases like 'by next Friday' or 'by 2025-10-10'
        m = re.search(r"by\s+(.+)$", text, flags=re.I)
        parsed = None
        if m:
            parsed = dateparser.parse(m.group(1).strip(), settings=settings)
        if not parsed:
            # try to parse full sentence for a date
            parsed = dateparser.parse(text, settings=settings)
        if parsed:
            # normalize to Asia/Tokyo and isoformat
            tz = pytz.timezone('Asia/Tokyo')
            if parsed.tzinfo is None:
                parsed = tz.localize(parsed)
            else:
                parsed = parsed.astimezone(tz)
            result['slots']['date_by_when'] = parsed.isoformat()

    return result